Preface


This book has grown out of notes for a course that the second author has given for more 
years than he cares to remember - which, but for the first author who kept various versions, 
would never have come to this. Specifically, the Institute of Sound and Vibration Research 
(ISVR) at the University of Southampton has, for many years, run a Masters programme 
in Sound and Vibration, and more recently in Applied Digital Signal Processing. A course 
aimed at introducing students to signal processing has been one of the compulsory mod- 
ules, and given the wide range of students’ first degrees, the coverage needs to make few 
assumptions about prior knowledge - other than a familiarity with degree entry-level math- 
ematics. In addition to the Masters programmes the ISVR runs undergraduate programmes 
in Acoustical Engineering, Acoustics with Music, and Audiology, each of which to varying 
levels includes signal processing modules. These taught elements underpin the wide-ranging 
research of the ISVR, exemplified by the four interlinked research groups in Dynamics, 
Fluid Dynamics and Acoustics, Human Sciences, and Signal Processing and Control. The 
large doctoral cohort in the research groups attend selected Masters modules and an acquain- 
tance with signal processing is a ‘required skill’ (necessary evil?) in many a research project. 
Building on the introductory course there are a large number of specialist modules ranging 
from medical signal processing to sonar, and from adaptive and active control to Bayesian 
methods. 

It was in one of the PhD cohorts that Kihong Shin and Joe Hammond made each other’s 
acquaintance in 1994. Kihong Shin received his PhD from ISVR in 1996 and was then a 
postdoctoral research fellow with Professor Mike Brennan in the Dynamics Group, then 
joining the School of Mechanical Engineering, Andong National University, Korea, in 2002, 
where he is an associate professor. This marked the start of this book, when he began ‘editing’ 
Joe Hammond’s notes appropriate to a postgraduate course he was lecturing - particularly
appreciating the importance of including ‘hands-on’ exercises - using interactive MATLAB®
examples. With encouragement from Professor Mike Brennan, Kihong Shin continued with
this and it was not until 2004, when a manuscript landed on Joe Hammond’s desk (some bits
looking oddly familiar), that the second author even knew of the project - with some surprise
and great pleasure.

In July 2006, with the kind support and consideration of Professor Mike Brennan, Kihong
Shin managed to take a sabbatical which he spent at the ISVR where his subtle pressures -
including attending Joe Hammond’s very last course on signal processing at the ISVR - have
distracted Joe Hammond away from his duties as Dean of the Faculty of Engineering, Science
and Mathematics.

Thus the text was completed. It is indeed an introduction to the subject and therefore the 
essential material is not new and draws on many classic books. What we have tried to do is 
to bring material together, hopefully encouraging the reader to question, enquire about and 
explore the concepts using the MATLAB exercises or derivatives of them. 

It only remains to thank all who have contributed to this. First, of course, the authors 
whose texts we have referred to, then the decades of students at the ISVR, and more recently 
in the School of Mechanical Engineering, Andong National University, who have shaped the 
way the course evolved, especially Sangho Pyo who spent a generous amount of time gath- 
ering experimental data. Two colleagues in the ISVR deserve particular gratitude: Professor 
Mike Brennan, whose positive encouragement for the whole project has been essential, to- 
gether with his very constructive reading of the manuscript; and Professor Paul White, whose 
encyclopaedic knowledge of signal processing has been our port of call when we needed 
reassurance. 

We would also like to express special thanks to our families, Hae-Ree Lee, Inyong Shin, 
Hakdoo Yu, Kyu-Shin Lee, Young-Sun Koo and Jill Hammond, for their never-ending support 
and understanding during the gestation and preparation of the manuscript. Kihong Shin is also 
grateful to Geun-Tae Yim for his continuing encouragement at the ISVR. 

Finally, Joe Hammond thanks Professor Simon Braun of the Technion, Haifa, for his
unceasing and inspirational leadership of signal processing in mechanical engineering. Also, 
and very importantly, we wish to draw attention to a new text written by Simon entitled 
Discover Signal Processing: An Interactive Guide for Engineers, also published by John 
Wiley & Sons, which offers a complementary and innovative learning experience. 

Please note that MATLAB codes (m files) and data files can be downloaded from the 
Companion Website at www.wiley.com/go/shin_hammond 


Kihong Shin 
Joseph Kenneth Hammond 


About the Authors 


Joe Hammond

Joseph (Joe) Hammond graduated in Aeronautical Engineering in 1966 at
the University of Southampton. He completed his PhD in the Institute of Sound and Vibration 
Research (ISVR) in 1972 whilst a lecturer in the Mathematics Department at Portsmouth 
Polytechnic. He returned to Southampton in 1978 as a lecturer in the ISVR, and was later 
Senior lecturer, Professor, Deputy Director and then Director of the ISVR from 1992-2001. 
In 2001 he became Dean of the Faculty of Engineering and Applied Science, and in 2003 
Dean of the Faculty of Engineering, Science and Mathematics. He retired in July 2007 and is 
an Emeritus Professor at Southampton. 

Kihong Shin

Kihong Shin graduated in Precision Mechanical Engineering from Hanyang
University, Korea in 1989. After spending several years as an electric motor design and NVH 
engineer in Samsung Electro-Mechanics Co., he started an MSc at Cranfield University in 
1992, on the design of rotating machines with reference to noise and vibration. Following 
this, he joined the ISVR and completed his PhD on nonlinear vibration and signal processing 
in 1996. In 2000, he moved back to Korea as a contract Professor of Hanyang University. In 
Mar. 2002, he joined Andong National University as an Assistant Professor, and is currently 
an Associate Professor. 


1 

Introduction to Signal Processing 


Signal processing is the name given to the procedures used on measured data to reveal the 
information contained in the measurements. These procedures essentially rely on various 
transformations that are mathematically based and which are implemented using digital tech- 
niques. The wide availability of software to carry out digital signal processing (DSP) with 
such ease now pervades all areas of science, engineering, medicine, and beyond. This ease 
can sometimes result in the analyst using the wrong tools - or interpreting results incorrectly 
because of a lack of appreciation or understanding of the assumptions or limitations of the 
method employed. 

This text is directed at providing a user’s guide to linear system identification. In order 
to reach that end we need to cover the groundwork of Fourier methods, random processes, 
system response and optimization. Recognizing that there are many excellent texts on this,¹
why should there be yet another? The aim is to present the material from a user’s viewpoint. 
Basic concepts are followed by examples and structured MATLAB® exercises allow the user 
to ‘experiment’. This will not be a story with the punch-line at the end - we actually start in 
this chapter with the intended end point. 

The aim of doing this is to provide reasons and motivation to cover some of the underlying 
theory. It will also offer a more rapid guide through methodology for practitioners (and others) 
who may wish to ‘skip’ some of the more ‘tedious’ aspects. In essence we are recognizing 
that it is not always necessary to be fully familiar with every aspect of the theory to be an 
effective practitioner. But what is important is to be aware of the limitations and scope of one’s 
analysis. 


1 See for example Bendat and Piersol (2000), Brigham (1988), Hsu (1970), Jenkins and Watts (1968), Oppenheim
and Schafer (1975), Otnes and Enochson (1978), Papoulis (1977), Randall (1987), etc.


Fundamentals of Signal Processing for Sound and Vibration Engineers 
K. Shin and J. K. Hammond. © 2008 John Wiley & Sons, Ltd 



The Aim of the Book 

We are assuming that the reader wishes to understand and use a widely used approach to 
‘system identification’. By this we mean we wish to be able to characterize a physical process
in a quantified way. The object of this quantification is that it reveals information about the 
process and accounts for its behaviour, and also it allows us to predict its behaviour in future 
environments. 

The ‘physical processes’ could be anything, e.g. vehicles (land, sea, air), electronic 
devices, sensors and actuators, biomedical processes, etc., and perhaps less ‘physically based’ 
socio-economic processes, and so on. The complexity of such processes is unlimited - and 
being able to characterize them in a quantified way relies on the use of physical ‘laws’ or other 
‘models’ usually phrased within the language of mathematics. Most science and engineering 
degree programmes are full of courses that are aimed at describing processes that relate to the 
appropriate discipline. We certainly do not want to go there in this book - life is too short! 
But we still want to characterize these systems - with the minimum of effort and with the 
maximum effect. 

This is where ‘system theory’ comes to our aid, where we employ descriptions or
models - abstractions from the ‘real thing’ - that nevertheless are able to capture what may be
fundamentally common to large classes of the phenomena described above. In essence what
we do is simply to watch what ‘a system’ does. This is of course totally useless if the system 
is ‘asleep’ and so we rely on some form of activation to get it going - in which case it is 
logical to watch (and measure) the particular activation and measure some characteristic of 
the behaviour (or response) of the system. 

In ‘normal’ operation there may be many activators and a host of responses. In most 
situations the activators are not separate discernible processes, but are distributed. An example 
of such a system might be the acoustic characteristics of a concert hall when responding to 
an orchestra and singers. The sources of activation in this case are the musical instruments 
and singers, the system is the auditorium, including the members of the audience, and the 
responses may be taken as the sounds heard by each member of the audience. 

The complexity of such a system immediately leads one to try and conceptualize 
something simpler. Distributed activation might be made more manageable by ‘lumping’
things together, e.g. a piano is regarded as several separate activators rather than continuous
strings/sounding boards all causing acoustic waves to emanate from each point on their
surfaces. We might start to simplify things as in Figure 1.1.

This diagram is a model of a greatly simplified system with several actuators - and the 
several responses as the sounds heard by individual members of the audience. The arrows 
indicate a ‘cause and effect’ relationship - and this also has implications. For example, the 
figure implies that the ‘activators’ are unaffected by the ‘responses’. This implies that there is 
no ‘feedback’ - and this may not be so. 


Figure 1.1 Conceptual diagram of a simplified system (Activators → System → Responses)

Figure 1.2 A single activator and a single response system (x(t) → System → y(t))


Having got this far let us simplify things even further to a single activator and a single 
response as shown in Figure 1.2. This may be rather ‘distant’ from reality but is a widely used 
model for many processes. 

It is now convenient to think of the activator x(t) and the response y(t) as time histories. 
For example, x(t) may denote a voltage, the system may be a loudspeaker and y(t) the pressure
at some point in a room. However, this time history model is just one possible scenario. The 
activator x may denote the intensity of an image, the system is an optical device and y may 
be a transformed image. Our emphasis will be on the time history model generally within a 
sound and vibration context. 

The box marked ‘System’ is a convenient catch-all term for phenomena of great variety 
and complexity. From the outset, we shall impose major constraints on what the box
represents - specifically systems that are linear² and time invariant.³ Such systems are very
usefully described by a particular feature, namely their response to an ideal impulse,⁴ and
their corresponding behaviour is then the impulse response.⁵ We shall denote this by the
symbol h(t).

Because the system is linear this rather ‘abstract’ notion turns out to be very useful 
in predicting the response of the system to any arbitrary input. This is expressed by the 
convolution⁶ of input x(t) and system h(t), sometimes abbreviated as

y(t) = h(t) * x(t)    (1.1)

where * denotes the convolution operation. Expressed in this form the system box is filled
with the characterization h(t) and the (mathematical) mapping or transformation from the
input x(t) to the response y(t) is the convolution integral.
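
The discrete-time counterpart of Equation (1.1) can be tried out directly. What follows is a minimal sketch in Python, not a definitive implementation: the function name `convolve` and the numerical values of `h` and `x` are illustrative assumptions, and the sampling-interval factor that would appear when approximating the continuous integral is omitted.

```python
def convolve(h, x):
    """Discrete convolution y[n] = sum over k of h[k] * x[n - k] (full length)."""
    y = [0.0] * (len(h) + len(x) - 1)
    for k, hk in enumerate(h):
        for n, xn in enumerate(x):
            y[k + n] += hk * xn
    return y


# Hypothetical impulse response: a decaying 'echo' (values chosen for illustration).
h = [1.0, 0.5, 0.25]

# An ideal (unit) impulse as the input returns the impulse response itself.
impulse = [1.0, 0.0, 0.0, 0.0]
y_impulse = convolve(h, impulse)

# Any other input produces a weighted sum of delayed impulse responses.
x = [2.0, 0.0, -1.0]
y = convolve(h, x)
print(y_impulse)
print(y)
```

Feeding in a unit impulse simply reproduces h, which is why a linear time-invariant system is fully characterized by its impulse response.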

System identification now becomes the problem of measuring x(t) and y(t) and deducing
the impulse response function h(t). Since we have three quantitative terms in the relationship 
(1.1), but (assume that) we know two of them, then, in principle at least, we should be able to 
find the third. The question is: how? 

Unravelling Equation (1.1) as it stands is possible but not easy. Life becomes considerably
easier if we apply a transformation that maps the convolution expression to a multiplication.
One such transformation is the Fourier transform.⁷ Taking the Fourier transform of the
convolution⁸ in Equation (1.1) produces

Y(f) = H(f)X(f)    (1.2)


* Words in bold will be discussed or explained at greater length later. 

2 See Chapter 4, Section 4.7. 

3 See Chapter 4, Section 4.7. 

4 See Chapter 3, Section 3.2, and Chapter 4, Section 4.7. 

5 See Chapter 4, Section 4.7. 

6 See Chapter 4, Section 4.7. 

7 See Chapter 4, Sections 4.1 and 4.4. 

8 See Chapter 4, Sections 4.4 and 4.7. 

where f denotes frequency, and X(f), H(f) and Y(f) are the transforms of x(t), h(t) and
y(t). This achieves the unravelling of the input-output relationship as a straightforward
multiplication - in a ‘domain’ called the frequency domain.⁹ In this form the system is
characterized by the quantity H(f) which is called the system frequency response function
(FRF).¹⁰
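
A numerical check of Equation (1.2) may help to make the point. The sketch below uses a directly coded discrete Fourier transform (DFT) as a stand-in for the Fourier transform. The helper names `dft` and `circular_convolve` and the sequences `h` and `x` are assumptions made for illustration. The sequences are zero padded so that the circular convolution implied by the DFT coincides with ordinary convolution.

```python
import cmath


def dft(x):
    """Direct O(N^2) discrete Fourier transform of a sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def circular_convolve(h, x):
    """Circular convolution of two equal-length sequences."""
    N = len(x)
    return [sum(h[k] * x[(n - k) % N] for k in range(N)) for n in range(N)]


# Hypothetical impulse response and input, zero padded to length 8.
h = [1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
x = [2.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

y = circular_convolve(h, x)          # time domain: y = h * x
X, H, Y = dft(x), dft(h), dft(y)     # frequency domain

# Largest departure from Y(f) = H(f) X(f) over all frequency bins.
max_error = max(abs(Yk - Hk * Xk) for Yk, Hk, Xk in zip(Y, H, X))
print(max_error)
```

The residual is at the level of floating-point rounding, i.e. the convolution in time has indeed become a bin-by-bin multiplication in frequency.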

The problem of ‘system identification’ now becomes the calculation of H(f), which
seems easy: that is, divide Y(f) by X(f), i.e. divide the Fourier transform of the output by the
Fourier transform of the input. As long as X(f) is never zero this seems to be the end of the
story - but, of course, it is not. Reality interferes in the form of ‘uncertainty’. The measurements
x(t) and y(t) are often not measured perfectly - disturbances or ‘noise’ contaminate them -
in which case the result of dividing two transforms of contaminated signals will be of limited
and dubious value.
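
The effect of contamination on the simple division can be illustrated as follows. This is only a sketch under stated assumptions: the impulse response `h`, the noise standard deviation of 0.3, the record length and the random seed are all arbitrary choices, and the DFT is again used in place of the Fourier transform.

```python
import cmath
import random


def dft(x):
    """Direct O(N^2) discrete Fourier transform of a sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def circular_convolve(h, x):
    """Circular convolution of two equal-length sequences."""
    N = len(x)
    return [sum(h[k] * x[(n - k) % N] for k in range(N)) for n in range(N)]


random.seed(0)                                   # fixed seed for repeatability
N = 64
h = [0.8 ** n for n in range(N)]                 # hypothetical impulse response
x = [random.gauss(0.0, 1.0) for _ in range(N)]   # excitation
y = circular_convolve(h, x)                      # noise-free response

H_true = dft(h)

# Noise-free measurements: the division recovers H(f) to rounding accuracy.
H_clean = [Yk / Xk for Yk, Xk in zip(dft(y), dft(x))]

# Contaminated measurements: additive noise on both input and output.
x_m = [v + random.gauss(0.0, 0.3) for v in x]
y_m = [v + random.gauss(0.0, 0.3) for v in y]
H_noisy = [Yk / Xk for Yk, Xk in zip(dft(y_m), dft(x_m))]

err_clean = max(abs(a - b) for a, b in zip(H_clean, H_true))
err_noisy = max(abs(a - b) for a, b in zip(H_noisy, H_true))
print(err_clean, err_noisy)
```

With clean data the error is negligible, but with even modest noise the bin-by-bin ratio departs substantially from the true frequency response, which is what motivates the estimators developed later.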

Also, the actual excitation signal x(t) may itself belong to a class of random¹¹ signals -
in which case the straightforward transformation (1.2) also needs more attention. It is this 
‘dual randomness’ of the actuating (and hence response) signal and additional contamination 
that is addressed in this book. 

The Effect of Uncertainty 

We have referred to randomness or uncertainty with respect to both the actuation and response 
signal and additional noise on the measurements. So let us redraw Figure 1.2 as in Figure 1.3. 


Figure 1.3 A single activator/response model with additive noise on measurements (x(t) → System → y(t); the noises n_x(t) and n_y(t) add to x(t) and y(t) to give the measured signals x_m(t) and y_m(t))

In Figure 1.3, x and y denote the actuation and response signals as before - which may
themselves be random. We also recognize that x and y are usually not directly measurable and
we model this by including disturbances written as n_x and n_y which add to x and y - so that
the actual measured signals are x_m and y_m. Now we get to the crux of the system identification:
that is, on the basis of (noisy) measurements x_m and y_m, what is the system?

We conceptualize this problem pictorially. Imagine plotting y_m against x_m (ignore for
now what x_m and y_m might be) as in Figure 1.4.

Each point in this figure is a ‘representation’ of the measured response y_m corresponding
to the measured actuation x_m.

System identification, in this context, becomes one of establishing a relationship between
y_m and x_m such that it somehow relates to the relationship between y and x. The noises are a

9 See Chapter 2, Section 2.1. 

10 See Chapter 4, Section 4.7. 

11 See Chapter 7, Section 7.2. 

Figure 1.4 A plot of the measured signals y_m versus x_m


nuisance, but we are stuck with them. This is where ‘optimization’ comes in. We try and find
a relationship between x_m and y_m that seeks a ‘systematic’ link between the data points which
suppresses the effects of the unwanted disturbances.

The simplest conceptual idea is to ‘fit’ a linear relationship between x_m and y_m. Why
linear? Because we are restricting our choice to the simplest relationship (we could of course
be more ambitious). The procedure we use to obtain this fit is seen in Figure 1.5 where the
slope of the straight line is adjusted until the match to the data seems best.

This procedure must be made systematic - so we need a measure of how well we fit the 
points. This leads to the need for a specific measure of fit and we can choose from an unlimited 
number. Let us keep it simple and settle for some obvious ones. In Figure 1.5, the closeness
of the line to the data is indicated by three measures e_y, e_x and e_T. These are regarded as
errors which are measures of the ‘failure’ to fit the data. The quantity e_y is an error in the y
direction (i.e. in the output direction). The quantity e_x is an error in the x direction (i.e. in the
input direction). The quantity e_T is orthogonal to the line and combines errors in both x and
y directions.

We might now look at ways of adjusting the line to minimize e_y, e_x, e_T or some convenient
‘function’ of these quantities. This is now phrased as an optimization problem. A most
convenient function turns out to be an average of the squared values of these quantities
(‘convenience’ here is used to reflect not only physical meaning but also mathematical ‘niceness’).
Minimizing these three different measures of closeness of fit results in three correspondingly
different slopes for the straight line; let us refer to the slopes as m_y, m_x, m_T. So which one
should we use as the best? The choice will be strongly influenced by our prior knowledge of 
the nature of the measured data - specifically whether we have some idea of the dominant 
causes of error in the departure from linearity. In other words, some knowledge of the relative 
magnitudes of the noise on the input and output. 



Figure 1.5 A linear fit to measured data 
We could look to the figure for a guide:

• m_y seems best when errors occur on y, i.e. errors on output e_y;

• m_x seems best when errors occur on x, i.e. errors on input e_x;

• m_T seems to make an attempt to recognize that errors are on both, i.e. e_T.
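
The three slopes can be computed for synthetic data as a rough illustration. The sketch below assumes a straight line through the origin, mean-square error measures, a ‘true’ slope of 2 and equal noise levels on input and output. These choices, along with the variable names, are ours and are made only for illustration. Under these assumptions, minimizing the squared e_y gives m_y = Σxy/Σx², minimizing the squared e_x gives m_x = Σy²/Σxy, and minimizing the squared orthogonal error e_T gives the positive root of a quadratic in the slope.

```python
import math
import random

random.seed(1)                       # fixed seed for repeatability
true_slope = 2.0                     # assumed underlying relationship y = 2x

# Noise-free signals and their noisy measurements (noise on both x and y).
x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
y = [true_slope * v for v in x]
x_m = [v + random.gauss(0.0, 0.2) for v in x]
y_m = [v + random.gauss(0.0, 0.2) for v in y]

# Sums of squares and cross-products of the measured data.
Sxx = sum(a * a for a in x_m)
Syy = sum(b * b for b in y_m)
Sxy = sum(a * b for a, b in zip(x_m, y_m))

m_y = Sxy / Sxx                      # minimizes squared errors in the y direction
m_x = Syy / Sxy                      # minimizes squared errors in the x direction
# Minimizes squared orthogonal distances: root of
# Sxy*m**2 + (Sxx - Syy)*m - Sxy = 0 with the same sign as Sxy.
m_T = (Syy - Sxx + math.sqrt((Syy - Sxx) ** 2 + 4.0 * Sxy ** 2)) / (2.0 * Sxy)

print(m_y, m_T, m_x)
```

For positively correlated data the three estimates are always ordered m_y ≤ m_T ≤ m_x, and with noise on the input m_y underestimates the true slope. Which estimate is preferable depends on where the noise is believed to enter.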

We might now ask how these rather simple concepts relate to ‘identifying’ the system in
Figure 1.3. It turns out that they are directly relevant and lead to three different estimators
for the system frequency response function H(f). They have come to be referred to in the
literature by the notation H_1(f), H_2(f) and H_T(f),¹² and are the analogues of the slopes m_y,
m_x, m_T, respectively.

We have now mapped out what the book is essentially about in Chapters 1 to 10. The 
book ends with a chapter that looks into the implications of multi-input/output systems. 


1.1 DESCRIPTIONS OF PHYSICAL DATA (SIGNALS) 

Observed data representing a physical phenomenon will be referred to as a time history or a 
signal. Examples of signals are: temperature fluctuations in a room indicated as a function of 
time, voltage variations from a vibration transducer, pressure changes at a point in an acoustic 
field, etc. The physical phenomenon under investigation is often translated by a transducer 
into an electrical equivalent (voltage or current) and if displayed on an oscilloscope it might 
appear as shown in Figure 1.6. This is an example of a continuous (or analogue) signal. 

In many cases, data are discrete owing to some inherent or imposed sampling procedure. 
In this case the data might be characterized by a sequence of numbers equally spaced in time. 
The sampled data of the signal in Figure 1.6 are indicated by the crosses on the graph shown 
in Figure 1.7. 



Figure 1.7 A discrete signal sampled at every Δ seconds (marked with ×); axes: volts against time (seconds)


12 See Chapter 9, Section 9.3.

Figure 1.8 An example of a signal where time is not the natural independent variable (road height h against spatial position ξ)


For continuous data we use the notation x(t), y(t), etc., and for discrete data various 
notations are used, e.g. x(nΔ), x(n), x_n (n = 0, 1, 2, ...).
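
The notation x(nΔ) can be made concrete with a short sketch. The 2 Hz sine wave used as the ‘continuous’ signal, the sampling interval of 0.05 seconds and the names `x_continuous`, `delta` and `samples` are assumptions chosen purely for illustration.

```python
import math


def x_continuous(t):
    """Hypothetical continuous signal x(t): a 2 Hz sine wave, in volts."""
    return math.sin(2.0 * math.pi * 2.0 * t)


delta = 0.05        # assumed sampling interval in seconds
n_samples = 20      # one second of data

# The discrete signal x(n*delta), n = 0, 1, 2, ...
times = [n * delta for n in range(n_samples)]
samples = [x_continuous(t) for t in times]

for t, v in zip(times, samples):
    print(f"{t:5.2f} s  {v:+.3f} V")
```

Only the values at the sampling instants are retained; what the signal does between them is not recorded.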

In certain physical situations, ‘time’ may not be the natural independent variable; for 
example, a plot of road roughness as a function of spatial position, i.e. h(ξ) as shown in
Figure 1.8. However, for uniformity we shall use time as the independent variable in all our 
discussions. 


1.2 CLASSIFICATION OF DATA 

Time histories can be broadly categorized as shown in Figure 1.9 (chaotic signals are added to
the classifications given by Bendat and Piersol, 2000). A fundamental difference is whether a
signal is deterministic or random, and the analysis methods are considerably different depending
on the ‘type’ of the signal. Generally, signals are mixed, so the classifications of Figure 1.9
may not be easily applicable, and thus the choice of analysis methods may not be apparent. In
many cases some prior knowledge of the system (or the signal) is very helpful for selecting an
appropriate method. However, it must be remembered that this prior knowledge (or assumption)
may also be a source of misleading results. Thus it is important to remember the First
Principle of Data Reduction (Ables, 1974):

The result of any transformation imposed on the experimental data shall incorporate and be 
consistent with all relevant data and be maximally non-committal with regard to unavailable 
data. 

It would seem that this statement summarizes what is self-evident. But how often do we 
contravene it - for example, by ‘assuming’ that a time history is zero outside the extent of a 
captured record? 


Signals
  Deterministic
    Periodic: sinusoidal; complex periodic
    Non-periodic: almost periodic; transient; (chaotic)
  Random
    Stationary
    Non-stationary

Figure 1.9 Classification of signals

Figure 1.10 A simple mass-spring system


Nonetheless, we need to start somewhere and signals can be broadly classified as being 
either deterministic or non-deterministic (random). Deterministic signals are those whose
behaviour can be predicted exactly. As an example, a mass-spring oscillator is considered in
Figure 1.10. The equation of motion is mẍ + kx = 0 (x is displacement and ẍ is acceleration).
If the mass is released from rest at a position x(t) = A at time t = 0, then the displacement
signal can be written as

$$x(t) = A \cos\left(\sqrt{\frac{k}{m}}\, t\right), \quad t \ge 0 \qquad (1.3)$$


In this case, the displacement x(t) is known exactly for all time. Various types of deter- 
ministic signals will be discussed later. Basic analysis methods for deterministic signals are 
covered in Part I of this book. Chaotic signals are not considered in this book. 
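
The closed-form displacement of the mass-spring oscillator can be checked numerically against its equation of motion. The following sketch assumes illustrative values for the mass `m`, stiffness `k` and release position `A`, and approximates the acceleration with a central finite difference; none of these choices comes from the text.

```python
import math

# Assumed parameter values (for illustration only).
m = 1.0      # mass in kg
k = 4.0      # spring stiffness in N/m
A = 0.5      # release position in m

omega = math.sqrt(k / m)   # natural frequency in rad/s


def x(t):
    """Displacement of the released mass: A cos(sqrt(k/m) t)."""
    return A * math.cos(omega * t)


def acceleration(t, dt=1e-4):
    """Central finite-difference approximation to the second derivative of x."""
    return (x(t + dt) - 2.0 * x(t) + x(t - dt)) / dt ** 2


# The residual of m*x'' + k*x = 0 should be close to zero at any time.
residuals = [m * acceleration(t) + k * x(t) for t in (0.0, 0.3, 1.0, 2.5)]
print(residuals)
```

Because the displacement is given by a formula, it can be evaluated at any instant without measurement, which is precisely what is meant by a deterministic signal.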

Non-deterministic signals are those whose behaviour cannot be predicted exactly. Some 
examples are vehicle noise and vibrations on a road, acoustic pressure variations in a wind 
tunnel, wave heights in a rough sea, temperature records at a weather station, etc. Various 
terminologies are used to describe these signals, namely random processes (signals), stochastic 
processes, time series, and the study of these signals is called time series analysis. Approaches 
to describe and analyse random signals require probabilistic and statistical methods. These 
are discussed in Part II of this book. 

The classification of data as being deterministic or random might be debatable in many 
cases and the choice must be made on the basis of knowledge of the physical situation. Often 
signals may be modelled as being a mixture of both, e.g. a deterministic signal ‘embedded’ 
in unwanted random disturbances (noise). 

In general, the purpose of signal processing is the extraction of information from a 
signal, especially when it is difficult to obtain from direct observation. The methodology of 
extracting information from a signal has three key stages: (i) acquisition, (ii) processing, (iii) 
interpretation. To a large extent, signal acquisition is concerned with instrumentation, and we 
shall treat some aspects of this, e.g. analogue-to-digital conversion.¹³ However, in the main,
we shall assume that the signal is already acquired, and concentrate on stages (ii) and (iii). 

13 See Chapter 5, Section 5.3.

Figure 1.11 A laboratory setup


Some ‘Real’ Data 

Let us now look at some signals measured experimentally. We shall attempt to fit the observed 
time histories to the classifications of Figure 1.9. 

(a) Figure 1.11 shows a laboratory setup in which a slender beam is suspended vertically
from a rigid clamp. Two forms of excitation are shown. A small piezoceramic PZT
(lead zirconate titanate) patch, bonded near the clamped end, is used as an actuator.
An instrumented hammer (impact hammer) is also used to excite the structure.
An accelerometer is attached to the beam tip to measure the response. We shall assume here
that digitization effects (ADC quantization, aliasing)¹⁴ have been adequately taken care of
and can be ignored. A sharp tap from the hammer to the structure results in Figures 1.12(a)
and (b). Relating these to the classification scheme, we could reasonably refer to these as
deterministic transients. Why might we use the deterministic classification? Because we expect
replication of the result for ‘identical’ impacts. Further, from the figures the signals appear to
be essentially noise free. From a systems point of view, Figure 1.12(a) is x(t) and 1.12(b) is
y(t) and from these two signals we would aim to deduce the characteristics of the beam.

(b) We now use the PZT actuator, and Figures 1.13(a) and (b) now relate to a random
excitation. The source is a band-limited,¹⁵ stationary,¹⁶ Gaussian process,¹⁷ and in the
steady state (i.e. after starting transients have died down) the response should also be stationary.
However, on the basis of the visual evidence the response is not evidently stationary (or is it?),
i.e. it seems modulated in some way. This demonstrates the difficulty in classification. As it

14 See Chapter 5, Sections 5.1-5.3.

15 See Chapter 5, Section 5.2, and Chapter 8, Section 8.7. 

16 See Chapter 8, Section 8.3. 

17 See Chapter 7, Section 7.3. 

Figure 1.12 Example of deterministic transient signals: (a) impact signal measured from the force sensor (impact hammer); (b) response signal to the impact measured from the accelerometer


happens, the response is a narrow-band stationary random process (due to the filtering action 
of the beam) which is characterized by an amplitude-modulated appearance. 

(c) Let us look at a signal from a machine rotating at a constant rate. A tachometer signal 
is taken from this. As in Figure 1.14(a), this is one that could reasonably be classified as 
periodic, although there are some discernible differences from period to period - one might 
ask whether this is simply an additive low-level noise. 

(d) Another repetitive signal arises from a telephone tone shown in Figure 1.14(b). The
tonality is ‘evident’ from listening to it and its appearance is ‘roughly’ periodic; it is tempting
to classify these signals as ‘almost periodic’!

(e) Figure 1.15(a) represents the signal for a transformer ‘hum’, which again perceptually 
has a repetitive but complex structure and visually appears as possibly periodic with additive 
noise - or (perhaps) narrow-band random. 

Figure 1.13 Example of stationary random signals: (a) input random signal to the PZT (actuator) patch; (b) response signal to the random excitation measured from the accelerometer (horizontal axes: t in seconds)


Figure 1.15(b) is a signal created by adding noise (broadband) to the telephone tone 
signal in Figure 1.14(b). It is not readily apparent that Figure 1.15(b) and Figure 1.15(a) are 
‘structurally’ very different. 

(f) Figure 1.16(a) is an acoustic recording of a helicopter flyover. The non-stationary 
structure is apparent - specifically, the increase in amplitude with reduction in range. What 
is not apparent are any other more complex aspects such as frequency modulation due to 
movement of the source. 

(g) The next group of signals relate to practicalities that occur during acquisition that 
render the data of limited value (in some cases useless!). 

The jagged stepwise appearance in Figure 1.17 is due to quantization effects in the ADC -
apparent because the signal being measured is very small compared with the voltage range of 
the ADC. 
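
The stepwise appearance is easy to reproduce. The sketch below models an idealized ADC as a uniform rounding quantizer. The 8-bit resolution, the ±10 V range, the 0.1 V signal amplitude and the function name `quantize` are assumptions chosen to exaggerate the effect and do not describe the equipment used for Figure 1.17.

```python
import math


def quantize(v, full_scale, bits):
    """Round v to the nearest level of an ideal ADC spanning +/- full_scale."""
    step = 2.0 * full_scale / (2 ** bits)   # voltage per quantization level
    return step * round(v / step)


full_scale = 10.0    # assumed ADC range of +/- 10 volts
bits = 8             # assumed resolution

# A small signal: amplitude 0.1 V, i.e. 1 % of the ADC range.
signal = [0.1 * math.sin(2.0 * math.pi * n / 50) for n in range(200)]
quantized = [quantize(v, full_scale, bits) for v in signal]

# Only a handful of distinct levels are ever used by the small signal.
levels_used = sorted(set(quantized))
print(levels_used)
```

The smooth sine wave is reduced to just three voltage levels, which is the jagged, low dynamic range behaviour described above. Matching the signal amplitude to the ADC range before digitizing avoids this.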

Figure 1.14 Example of periodic (and almost periodic) signals


(h) Figures 1.18(a), (b) and (c) all display flats at the top and bottom (positive and 
negative) of their ranges. This is characteristic of ‘clipping’ or saturation. These have been 
synthesized by clipping the telephone signal in Figure 1.14(b), the band-limited random signal
in Figure 1.13(a) and the accelerometer signal in Figure 1.12(b). Clipping is a nonlinear effect 
which ‘creates’ spurious frequencies and essentially destroys the credibility of any Fourier 
transformation results. 
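
That clipping ‘creates’ frequencies not present in the original signal can be seen in a small numerical experiment. The sketch below is an illustration under assumed values: a sine wave with 8 cycles in 256 samples is clipped at half its amplitude, and the magnitude of the DFT at three times the fundamental frequency is compared before and after clipping. The helper name `dft_magnitude` is ours.

```python
import cmath
import math


def dft_magnitude(x, k):
    """Magnitude of the k-th DFT coefficient, normalized by the record length."""
    N = len(x)
    return abs(sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                   for n in range(N))) / N


N = 256
cycles = 8            # the sine completes exactly 8 cycles in the record

sine = [math.sin(2.0 * math.pi * cycles * n / N) for n in range(N)]
clipped = [max(-0.5, min(0.5, v)) for v in sine]   # saturate at +/- 0.5

# Content at the third harmonic (3 x fundamental) before and after clipping.
third_harmonic_sine = dft_magnitude(sine, 3 * cycles)
third_harmonic_clipped = dft_magnitude(clipped, 3 * cycles)
print(third_harmonic_sine, third_harmonic_clipped)
```

The pure sine has essentially no content at the third harmonic, whereas the clipped version has a clearly non-zero component there. Any subsequent spectral analysis would report this component as if it were part of the measured phenomenon.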

(i) Lastly Figures 1.19(a) and (b) show what happens when ‘control’ of an experiment 
is not as tight as it might be. Both signals are the free responses of the cantilever beam shown 
in Figure 1.11. Figure 1.19(a) shows the results of the experiment performed on a vibration- 
isolated optical table. The signal is virtually noise free. Figure 1.19(b) shows the results of the 
same experiment, but performed on a normal bench-top table. The signal is now contaminated 
with noise that may come from various external sources. Note that we may not be able to 
control our experiments as carefully as in Figure 1.19(a), but, in fact, it is a signal as in 


Figure 1.15 Example of periodic signals with additive noise; panel (a): transformer ‘hum’ noise

Figure 1.16 Example of a non-stationary signal (helicopter flyover noise)

Figure 1.17 Example of low dynamic range

Figure 1.19(b) which we often deal with. Thus, the nature of uncertainty in the measurement 
process is again emphasized (see Figure 1 .3). 

The Next Stage 

Having introduced various classes of signals we can now turn to the principles and details 
of how we can model and analyse the signals. We shall use Fourier-based methods - that 
is, we essentially model the signal as being composed of sine and cosine waves and tailor 
the processing around this idea. We might argue that we are imposing/assuming some prior 
information about the signal - namely, that sines and cosines are appropriate descriptors. Whilst 
this may seem constraining, such a ‘prior model’ is very effective and covers a wide range of 
phenomena. This is sometimes referred to as a non-parametric approach to signal processing. 

So, what might be a ‘parametric’ approach? This can again be related to modelling. We 
may have additional ‘prior information’ as to how the signal has been generated, e.g. a result of 
filtering another signal. This notion may be extended from the knowledge that this generation 
process is indeed ‘physical’ to that of its being ‘notional’, i.e. another model. Specifically 
Figure 1.20 depicts this when s(t) is the ‘measured’ signal, which is conceived to have arisen 
from the action of a system being driven by a very fundamental signal - in this case so-called 
white noise w(t) (see footnote 18). 

Phrased in this way the analysis of the signal s(t) can now be transformed into a problem of 
determining the details of the system. The system could be characterized by a set of parameters, 
e.g. it might be mathematically represented by differential equations and the parameters are the 
coefficients. Set up like this, the analysis of s(t) becomes one of system parameter estimation - 
hence this is a parametric approach. 

The system could be linear, time varying or nonlinear depending on one’s prior knowl- 
edge, and could therefore offer advantages over Fourier-based methods. However, we shall 
not be pursuing this approach in this book and will get on with the Fourier-based methods 
instead. 

18 See Chapter 8, Section 8.6. 

Figure 1.18 Examples of clipped signals: (a) clipped (almost) periodic signal 

Figure 1.19 Examples of experimental noise: (a) signal is measured on the optical table (fitted with a vibration isolator) 

Figure 1.20 A white-noise-excited system (input w(t), output s(t)) 


We have emphasized that this is a book for practitioners and users of signal processing, 
but note also that there should be sufficient detail for completeness. Accordingly we have 
chosen to highlight some main points using a light grey background. From Chapter 3 onwards 
there is a reasonable amount of mathematical content; however, a reader may wish to get 
to the main points quickly, which can be done by using the highlighted sections. The details 
supporting these points are in the remainder of the chapter adjacent to these sections and in the 
appendices. Examples and MATLAB exercises illustrate the concepts. A superscript notation 
is used to denote the relevant MATLAB example given in the last section of the chapter, e.g. 
see the superscript (M2.1) on page 21 for MATLAB Example 2.1 given on page 26. 


Part I 

Deterministic Signals 


2 


Classification of Deterministic Data 


Introduction 


As described in Chapter 1, deterministic signals can be classified as shown in Figure 2.1. In 
this figure, chaotic signals are not considered and the sinusoidal signal and more general 
periodic signals are dealt with together. So deterministic signals are now classified as 
periodic, almost periodic and transient, and some basic characteristics are explained 
below. 

Deterministic 
    Periodic 
    Non-periodic 
        Almost periodic 
        Transient 

Figure 2.1 Classification of deterministic signals 


2.1 PERIODIC SIGNALS 

Periodic signals are defined as those whose waveform repeats exactly at regular time intervals. 
The simplest example is a sinusoidal signal as shown in Figure 2.2(a), where the time interval 
for one full cycle is called the period $T_P$ (in seconds) and its reciprocal $1/T_P$ is called the 
frequency (in hertz). Another example is a triangular signal (or sawtooth wave), as shown in 
Figure 2.2(b). This signal has an abrupt change (or discontinuity) every $T_P$ seconds. A more 
general periodic signal is shown in Figure 2.2(c) where an arbitrarily shaped waveform repeats 
with period $T_P$. 

Figure 2.2 Examples of periodic signals: (c) general periodic signal 

In each case the mathematical definition of periodicity implies that the behaviour of the 
wave is unchanged for all time. This is expressed as 

$$x(t) = x(t + nT_P), \quad n = \pm 1, \pm 2, \pm 3, \ldots \tag{2.1}$$ 

For cases (a) and (b) in Figure 2.2, explicit mathematical descriptions of the wave are easy 
to write, but the mathematical expression for the case (c) is not obvious. The signal (c) may 
be obtained by measuring some physical phenomenon, such as the output of an accelerometer 
placed near the cylinder head of a constant speed car engine. In this case, it may be more 
useful to consider the signal as being made up of simpler components. One approach to this 
is to ‘transform’ the signal into the ‘frequency domain’ where the details of periodicities of 
the signal are clearly revealed. In the frequency domain, the signal is decomposed into an 
infinite (or a finite) number of frequency components. The periodic signals appear as discrete 
components in this frequency domain, and are described by a Fourier series which is discussed 
in Chapter 3. As an example, the frequency domain representation of the amplitudes of the 
triangular wave (Figure 2.2(b)) with a period of $T_P = 2$ seconds is shown in Figure 2.3. 
The components in the frequency domain consist of the fundamental frequency $1/T_P$ and its 
harmonics $2/T_P$, $3/T_P$, ..., i.e. all frequency components are ‘harmonically related’. 

However, there is hardly ever a perfect periodic signal in reality even if the signal is 
carefully controlled. For example, almost all so-called periodic signals produced by a signal 
generator used in sound and vibration engineering are not perfectly periodic owing to the 
limited precision of the hardware and noise. An example of this may be a telephone keypad 
tone that usually consists of two frequency components (assume the ratio of the two frequencies 
is a rational number - see Section 2.2). The measured time data of the telephone tone of keypad 
‘8’ are shown in Figure 2.4(a), where it seems to be a periodic signal. However, when it is 
transformed into the frequency domain, we may find something different. The telephone tone 
of keypad ‘8’ is designed to have frequency components at 852 Hz and 1336 Hz only. This 
measured telephone tone is transformed into the frequency domain as shown in Figures 2.4(b) 
(linear scale) and (c) (log scale). On a linear scale, it seems to be composed of the two 
frequencies. However, there are, in fact, many other frequency components that may result if 
the signal is not perfectly periodic, and this can be seen by plotting the transform on a log 
scale as in Figure 2.4(c). 

Figure 2.3 Frequency domain representation of the amplitudes of a triangular wave with a period of $T_P = 2$ 

Another practical example of a signal that may be considered to be periodic is transformer 
hum noise (Figure 2.5(a)) whose dominant frequency components are about 122 Hz, 366 Hz 
and 488 Hz, as shown in Figure 2.5(b). From Figure 2.5(a), it is apparent that the signal is 
not periodic. However, from Figure 2.5(b) it is seen to have a periodic structure contaminated 
with noise. 

From the above two practical examples, we note that most periodic signals in practical 
situations are not ‘truly’ periodic, but are ‘almost’ periodic. The term ‘almost periodic’ is 
discussed in the next section. 


2.2 ALMOST PERIODIC SIGNALS (M2.1) (This superscript is short for MATLAB Example 2.1) 

The name ‘almost periodic’ seems self-explanatory; such a signal is sometimes called quasi-periodic, 
i.e. it looks periodic but in fact it is not if observed closely. We shall see in Chapter 3 that 
suitably selected sine and cosine waves may be added together to represent cases (b) and (c) 
in Figure 2.2. Also, even for apparently simple situations the sum of sines and cosines results 
in a wave which never repeats itself exactly. As an example, consider a wave consisting of 
two sine components as below 


$$x(t) = A_1 \sin(2\pi p_1 t + \theta_1) + A_2 \sin(2\pi p_2 t + \theta_2) \tag{2.2}$$ 


Figure 2.4 Measured telephone tone (No. 8) signal considered as periodic: (a) time history, x(t) (volts); (b) frequency components (linear scale), |X(f)| (volts/Hz) against frequency (kHz); (c) frequency components (log scale) 

Figure 2.5 Measured transformer hum noise signal: (a) time history 


where $A_1$ and $A_2$ are amplitudes, $p_1$ and $p_2$ are the frequencies of each sine component, and 
$\theta_1$ and $\theta_2$ are called the phases. If the frequency ratio $p_1/p_2$ is a rational number, the signal 
x(t) is periodic and repeats at every time interval of the smallest common period of both $1/p_1$ 
and $1/p_2$. However, if the ratio $p_1/p_2$ is irrational (as an example, the ratio $p_1/p_2 = 2/\sqrt{2}$ is 
irrational), the signal x(t) never repeats. It can be argued that the sum of two or more sinusoidal 
components is periodic only if the ratios of all pairs of frequencies are found to be rational 
numbers (i.e. ratio of integers). A possible example of an almost periodic signal may be an 
acoustic signal created by tapping a slightly asymmetric wine glass. 
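
As a minimal sketch of how the smallest common period can be found when the frequency ratio is rational, the following MATLAB lines use the illustrative values 1.4 Hz and 1.5 Hz for the two frequencies. The variable names and the scale factor of 10 (chosen so that both frequencies become integers) are assumptions made for this illustration only.

```matlab
% Sketch: smallest common period of two sinusoids with a rational frequency ratio.
p1 = 1.4; p2 = 1.5;      % illustrative frequencies in Hz
scale = 10;              % assumed factor that turns both frequencies into integers
f0 = gcd(round(p1*scale), round(p2*scale)) / scale;   % fundamental frequency (Hz)
Tp = 1 / f0;             % smallest common period (s)

% Numerical check of the periodicity condition x(t) = x(t + Tp)
t  = 0:0.01:5;
x  = sin(2*pi*p1*t) + sin(2*pi*p2*t);
xs = sin(2*pi*p1*(t + Tp)) + sin(2*pi*p2*(t + Tp));
maxDiff = max(abs(x - xs));
fprintf('f0 = %.2f Hz, Tp = %.1f s, max|x(t+Tp)-x(t)| = %.2e\n', f0, Tp, maxDiff);
```

Here both frequencies are integer multiples of the greatest common divisor 0.1 Hz, so the waveform repeats every 10 seconds. For an irrational ratio no such scale factor exists, which is the numerical counterpart of the statement that the signal never repeats.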

However, the representation (model) of a signal as the addition of simpler (sinusoidal) 
components is very attractive - whether the signal is truly periodic or not. In fact a method 
which predated the birth of Fourier analysis uses this idea. This is the so-called Prony series 
(de Prony, 1795; Spitznogle and Quazi, 1970; Kay and Marple, 1981; Davies, 1983). The 
basic components here have the form $A e^{-at} \sin(\omega t + \phi)$ in which there are four parameters 
for each component - namely, amplitude $A$, frequency $\omega$, phase $\phi$ and an additional feature $a$ 
which controls the decay of the component. 

Prony analysis fits a sum of such components to the data using an optimization procedure. 
The parameters are found from a (nonlinear) algorithm. The nonlinear nature of the 
optimization arises because (even if $a = 0$) the frequency $\omega$ is calculated for each component. 
This is in contrast to Fourier methods where the frequencies are fixed once the period $T_P$ is 
known, i.e. only amplitudes and phases are calculated. 
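
The contrast can be illustrated with a short sketch. When the frequencies are fixed in advance, each component $A\sin(\omega t + \phi)$ can be written as $(A\cos\phi)\sin\omega t + (A\sin\phi)\cos\omega t$, which is linear in the two unknown coefficients, so amplitudes and phases follow from a linear least-squares fit. The synthetic data, the frequencies and all variable names below are hypothetical; this is not the Prony algorithm itself, only the linear step that remains once the frequencies are known.

```matlab
% Sketch: amplitudes and phases by linear least squares when frequencies are known.
t = (0:0.001:1)';                 % time vector (s), hypothetical record
f = [5 12];                       % assumed known frequencies (Hz)
x = 2.0*sin(2*pi*f(1)*t + 0.3) + 0.5*sin(2*pi*f(2)*t - 1.0);   % synthetic data

% Regression matrix: one sine column and one cosine column per frequency
G = [sin(2*pi*f(1)*t) cos(2*pi*f(1)*t) sin(2*pi*f(2)*t) cos(2*pi*f(2)*t)];
c = G \ x;                        % least-squares solution for the coefficients

A   = [hypot(c(1), c(2)) hypot(c(3), c(4))];   % recovered amplitudes
phi = [atan2(c(2), c(1)) atan2(c(4), c(3))];   % recovered phases (rad)
```

If the frequencies (and decay rates) were also unknown, they would enter inside the sine and exponential functions and the fit would no longer be linear, which is why Prony-type methods need a nonlinear procedure.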


2.3 TRANSIENT SIGNALS 

The word ‘transient’ implies some limitation on the duration of the signal. Generally speaking, 
a transient signal has the property that x(t) = 0 when $t \to \pm\infty$; some examples are shown 
in Figure 2.6. In vibration engineering, a common practical example is impact testing (with a 
hammer) to estimate the frequency response function (FRF, see Equation (1.2)) of a structure. 
The measured input force signal and output acceleration signal from a simple cantilever beam 
experiment are shown in Figure 2.7. The frequency characteristic of this type of signal is 
very different from the Fourier series. The discrete frequency components are replaced by the 
concept of the signal containing a continuum of frequencies. The mathematical details and 
interpretation in the frequency domain are presented in Chapter 4. 

Note also that the modal characteristics of the beam allow the transient response to be 
modelled as the sum of decaying oscillations, i.e. ideally matched to the Prony series. This 
allows the Prony model to be ‘fitted to’ the data (see Davies, 1983) to estimate the amplitudes, 
frequencies, damping and phases, i.e. a parametric approach. 


2.4 BRIEF SUMMARY AND CONCLUDING REMARKS 

1. Deterministic signals are largely classified as periodic, almost periodic and transient 
signals. 

2. Periodic and almost periodic signals have discrete components in the frequency 
domain. 

3. Almost periodic signals may be considered as periodic signals having an infinitely 
long period. 

4. Transient signals are analysed using the Fourier integral (see Chapter 4). 

Chapters 1 and 2 have been introductory and qualitative. We now add detail to these 
descriptions and note again that a quick ‘skip-through’ can be made by following 
the highlighted sections. MATLAB examples are also presented with enough detail 
to allow the reader to try them and to understand important features (MATLAB 
version 7.1 is used, and Signal Processing Toolbox is required for some MATLAB 
examples). 




Figure 2.6 Examples of transient signals 

Figure 2.7 Practical examples of transient signals (measured from an impact testing experiment): (a) signal from the force sensor (impact hammer) 

2.5 MATLAB EXAMPLES (see footnote 1) 


Example 2.1: Synthesis of periodic signals and almost periodic signals 

(see Section 2.2) 


Consider Equation (2.2) for this example, i.e. 

$$x(t) = A_1 \sin(2\pi p_1 t + \theta_1) + A_2 \sin(2\pi p_2 t + \theta_2)$$ 

Let the amplitudes $A_1 = A_2 = 1$ and phases $\theta_1 = \theta_2 = 0$ for convenience. 


Case 1: 

Periodic signal with frequencies $p_1 = 1.4$ Hz and $p_2 = 1.5$ Hz. 

Note that the ratio $p_1/p_2$ is rational, and the smallest common period of both 
$1/p_1$ and $1/p_2$ is ‘10’, thus the period is 10 seconds in this case. 

Line 1 
MATLAB code: clear all 
Comments: Removes all local and global variables (this is a good way to start a new MATLAB script). 

Line 2 
MATLAB code: A1=1; A2=1; Theta1=0; Theta2=0; p1=1.4; p2=1.5; 
Comments: Define the parameters for Equation (2.2). Semicolon (;) separates statements and prevents displaying the results on the screen. 

Line 3 
MATLAB code: t=[0:0.01:30]; 
Comments: The time variable t is defined as a row vector from zero to 30 seconds with a step size 0.01. 

Line 4 
MATLAB code: x=A1*sin(2*pi*p1*t+Theta1)+A2*sin(2*pi*p2*t+Theta2); 
Comments: MATLAB expression of Equation (2.2). 

Line 5 
MATLAB code: plot(t, x) 
Comments: Plot the results of t versus x (t on abscissa and x on ordinate). 

Line 6 
MATLAB code: xlabel('\itt\rm (seconds)'); ylabel('\itx\rm(\itt\rm)') 
Comments: Add text on the horizontal (xlabel) and on the vertical (ylabel) axes. ‘\it’ is for italic font, and ‘\rm’ is for normal font. Readers may find more ways of dealing with graphics in the section ‘Handle Graphics Objects’ in the MATLAB Help window. 

Line 7 
MATLAB code: grid on 
Comments: Add grid lines on the current figure. 


1 MATLAB codes (m files) and data files can be downloaded from the Companion Website (www.wiley.com/go/shin_hammond). 

Results 

Comments: It is clear that this signal is periodic and repeats every 10 seconds, i.e. 
$T_P = 10$ seconds, thus the fundamental frequency is 0.1 Hz. The frequency domain 
representation of the above signal is shown in Figure (b). Note that the amplitude of 
the fundamental frequency is zero and thus does not appear in the figure. This applies 
to subsequent harmonics until 1.4 Hz and 1.5 Hz. Note also that the frequency 
components 1.4 Hz and 1.5 Hz are ‘harmonically’ related, i.e. both are multiples of 
0.1 Hz. 



(b) Fourier transform of x(t) (periodic) 

Case 2: 

Almost periodic signal with frequencies $p_1 = \sqrt{2}$ Hz and $p_2 = 1.5$ Hz. 

Note that the ratio $p_1/p_2$ is now irrational, so there is no common period of both 
$1/p_1$ and $1/p_2$. 

Line 1 
MATLAB code: clear all 

Line 2 
MATLAB code: A1=1; A2=1; Theta1=0; Theta2=0; p1=sqrt(2); p2=1.5; 

Line 3 
MATLAB code: t=[0:0.01:30]; 

Line 4 
MATLAB code: x=A1*sin(2*pi*p1*t+Theta1)+A2*sin(2*pi*p2*t+Theta2); 

Line 5 
MATLAB code: plot(t, x) 

Line 6 
MATLAB code: xlabel('\itt\rm (seconds)'); ylabel('\itx\rm(\itt\rm)') 

Line 7 
MATLAB code: grid on 

Comments (Lines 1 to 7): Exactly the same script as in the previous case except ‘p1=1.4’ is replaced with ‘p1=sqrt(2)’. 

Results 



(a) Almost periodic signal 


Comments: One can find that this signal is not periodic if it is observed carefully by 
closely investigating or magnifying appropriate regions. The frequency domain representation 
of the above signal is shown in Figure (b). Since the signal is not exactly 
periodic, the usual concept of the fundamental frequency does not hold. However, it may 
be considered that the periodicity of this signal is infinite, i.e. the fundamental frequency 
is ‘0 Hz’ (this concept leads us to the Fourier integral which is discussed in Chapter 4). 
The spread of the frequency components in the figure is not representative of the true 
frequency components in the signal, but results from the truncation of the signal, i.e. it 
is a windowing effect (see Sections 3.6 and 4.11 for details). 

(b) Fourier transform of x(t) (almost periodic) 


3 

Fourier Series 


Introduction 

This chapter describes the simplest of the signal types - periodic signals. It begins with the 
ideal situation and the basis of Fourier decomposition, and then, through illustrative examples, 
discusses some of the practical issues that arise. The delta function is introduced, which is 
very useful in signal processing. The chapter concludes with some examples based on the 
MATLAB software environment. 

The presentation is reasonably detailed, but to assist the reader in skipping through to 
find the main points being made, some equations and text are highlighted. 


3.1 PERIODIC SIGNALS AND FOURIER SERIES 

Periodic signals are analysed using Fourier series. The basis of Fourier analysis of a 
periodic signal is the representation of such a signal by adding together sine and cosine 
functions of appropriate frequencies, amplitudes and relative phases. For a single sine 
wave 


$$x(t) = X \sin(\omega t + \phi) = X \sin(2\pi f t + \phi) \tag{3.1}$$ 

where $X$ is amplitude, 
$\omega$ is circular (angular) frequency in radians per unit time (rad/s), 
$f$ is (cyclical) frequency in cycles per unit time (Hz), 
$\phi$ is phase angle with respect to the time origin in radians. 


The period of this sine wave is $T_P = 1/f = 2\pi/\omega$ seconds. A positive phase angle $\phi$ 
shifts the waveform to the left (a lead or advance) and a negative phase angle to the right 
(a lag or delay), where the time shift is $\phi/\omega$ seconds. When $\phi = \pi/2$ the wave becomes a 
cosine wave. The Fourier series (for periodic signal) is now described. A periodic signal, x(t), 
is shown in Figure 3.1 and satisfies 

$$x(t) = x(t + nT_P), \quad n = \pm 1, \pm 2, \pm 3, \ldots \tag{3.2}$$ 

Figure 3.1 A periodic signal with a period $T_P$ 


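With a few exceptions such periodic functions may be represented by 

$$x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos\left(\frac{2\pi n t}{T_P}\right) + b_n \sin\left(\frac{2\pi n t}{T_P}\right) \right] \tag{3.3}$$ 

The fundamental frequency is $f_1 = 1/T_P$ and all other frequencies are multiples 
of this. $a_0/2$ is the d.c. level or mean value of the signal. The reason for wanting to use 
a representation of the form (3.3) is because it is useful to decompose a ‘complicated’ 
signal into a sum of ‘simpler’ signals - in this case, sine and cosine waves. The amplitude 
and phase of each component can be obtained from the coefficients $a_n$, $b_n$, as we shall see 
later in Equation (3.12). These coefficients are calculated from the following expressions: 

$$\frac{a_0}{2} = \frac{1}{T_P}\int_0^{T_P} x(t)\,dt$$ 

$$a_n = \frac{2}{T_P}\int_0^{T_P} x(t)\cos\left(\frac{2\pi n t}{T_P}\right)dt, \quad n = 1, 2, \ldots$$ 

$$b_n = \frac{2}{T_P}\int_0^{T_P} x(t)\sin\left(\frac{2\pi n t}{T_P}\right)dt, \quad n = 1, 2, \ldots \tag{3.4}$$ 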

We justify the expressions (3.4) for the coefficients $a_n$, $b_n$ as follows. Suppose we wish to 
add up a set of ‘elementary’ functions $u_n(t)$, $n = 1, 2, \ldots$, so as to represent a function x(t), i.e. 
we want $\sum_n c_n u_n(t)$ to be a ‘good’ representation of x(t). We may write $x(t) \approx \sum_n c_n u_n(t)$, 
where $c_n$ are coefficients to be found. Note that we cannot assume equality in this expression. 
To find the coefficients $c_n$, we form an error function $e(t) = x(t) - \sum_n c_n u_n(t)$ and select the 
$c_n$ so as to minimize some function of e(t), e.g. 

$$J = \int_0^{T_P} e^2(t)\,dt \tag{3.5}$$ 

Since the $u_n(t)$ are chosen functions, $J$ is a function of $c_1, c_2, \ldots$ only, so in order to minimize 
$J$ we need necessary conditions as below: 

$$\frac{\partial J}{\partial c_m} = 0 \quad \text{for } m = 1, 2, \ldots \tag{3.6}$$ 

The function $J$ is 

$$J(c_1, c_2, \ldots) = \int_0^{T_P} \left( x(t) - \sum_n c_n u_n(t) \right)^2 dt$$ 

and so Equation (3.6) becomes 

$$\frac{\partial J}{\partial c_m} = \int_0^{T_P} 2\left( x(t) - \sum_n c_n u_n(t) \right)\left(-u_m(t)\right)dt = 0 \tag{3.7}$$ 

Thus the following result is obtained: 

$$\int_0^{T_P} x(t)u_m(t)\,dt = \sum_n c_n \int_0^{T_P} u_n(t)u_m(t)\,dt \tag{3.8}$$ 

At this point we can see that a very desirable property of the ‘basis set’ $u_n$ is that 

$$\int_0^{T_P} u_n(t)u_m(t)\,dt = 0 \quad \text{for } n \neq m \tag{3.9}$$ 

i.e. they should be ‘orthogonal’. 

Assuming this is so, then using the orthogonal property of Equation (3.9) gives the 
required coefficients as 

$$c_m = \frac{\int_0^{T_P} x(t)u_m(t)\,dt}{\int_0^{T_P} u_m^2(t)\,dt} \tag{3.10}$$ 
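
A short numerical illustration of Equations (3.9) and (3.10) may help. The sketch below takes harmonically related sine waves as the basis functions, checks that two different members integrate to zero over one period, and then recovers a coefficient from a test signal. The period, the test signal and the variable names are all assumptions chosen for the illustration, and the integrals are approximated with the trapezoidal rule.

```matlab
% Sketch: orthogonality (3.9) and coefficient recovery (3.10) for sine basis functions.
Tp = 2;                            % assumed period (s)
N  = 20000;
t  = (0:N) * (Tp/N);               % one full period, end points included
u  = @(n) sin(2*pi*n*t/Tp);        % basis function u_n(t), chosen for illustration

crossTerm = trapz(t, u(2) .* u(3));          % Equation (3.9) with n = 2, m = 3

x  = 0.7*u(3) + 0.2*cos(2*pi*2*t/Tp);        % hypothetical test signal
c3 = trapz(t, x .* u(3)) / trapz(t, u(3).^2);   % Equation (3.10) with m = 3
fprintf('cross term = %.2e, c3 = %.4f\n', crossTerm, c3);
```

The cross term is zero to within rounding, and the recovered coefficient equals the 0.7 used to build the test signal; the cosine component does not disturb it because it is orthogonal to the chosen sine.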


Equation (3.10) is the equivalent of Equation (3.4) for the particular case of selecting sines 
and cosines as the basic elements. Specifically Equation (3.4) utilizes the following results of 
orthogonality: 

$$\int_0^{T_P} \cos\left(\frac{2\pi m t}{T_P}\right)\sin\left(\frac{2\pi n t}{T_P}\right)dt = 0 \quad \text{for all } m, n$$ 

$$\int_0^{T_P} \cos\left(\frac{2\pi m t}{T_P}\right)\cos\left(\frac{2\pi n t}{T_P}\right)dt = 0, \qquad \int_0^{T_P} \sin\left(\frac{2\pi m t}{T_P}\right)\sin\left(\frac{2\pi n t}{T_P}\right)dt = 0 \quad \text{for } m \neq n \tag{3.11}$$ 

We began by referring to the amplitude and phase of the components of a Fourier 
series. This is made explicit now by rewriting Equation (3.3) in the form 

$$x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} M_n \cos(2\pi n f_1 t + \phi_n) \tag{3.12}$$ 

where $f_1 = 1/T_P$ is the fundamental frequency, 
$M_n = \sqrt{a_n^2 + b_n^2}$ are the amplitudes of frequencies at $nf_1$, 
$\phi_n = \tan^{-1}(-b_n/a_n)$ are the phases of the frequency components at $nf_1$. 


Note that we have assumed that the summation of the components does indeed accurately 
represent the signal x(t), i.e. we have tacitly assumed the sum converges, and furthermore 
converges to the signal. This is discussed further in what follows. 
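
The amplitude and phase form can be checked numerically for a single harmonic. In the sketch below the coefficient values, the fundamental frequency and the variable names are arbitrary assumptions. The four-quadrant function atan2 is used in place of a plain inverse tangent so that the phase is placed in the correct quadrant when the cosine coefficient is negative; this is a choice of this sketch.

```matlab
% Sketch: a*cos(.) + b*sin(.) rewritten as M*cos(. + phi) for one harmonic.
a = -0.8; b = 0.6;                 % hypothetical coefficients a_n and b_n
f1 = 2; n = 3;                     % assumed fundamental frequency (Hz) and harmonic number
t = linspace(0, 1, 1001);

M   = sqrt(a^2 + b^2);             % amplitude of the harmonic
phi = atan2(-b, a);                % phase (rad), four-quadrant version of atan(-b/a)

lhs = a*cos(2*pi*n*f1*t) + b*sin(2*pi*n*f1*t);
rhs = M*cos(2*pi*n*f1*t + phi);
err = max(abs(lhs - rhs));         % should be at rounding level
```

With these values the amplitude is 1, and the two expressions agree to within rounding error. A two-quadrant inverse tangent of $-b_n/a_n$ would give a phase in error by $\pi$ here because $a_n$ is negative.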


An Example (A Square Wave) 


As an example, let us find the Fourier series of the function defined by 

$$x(t) = \begin{cases} -1, & -\dfrac{T}{2} < t < 0 \\ 1, & 0 < t < \dfrac{T}{2} \end{cases} \quad \text{and } x(t + nT) = x(t), \quad n = \pm 1, \pm 2, \ldots \tag{3.13}$$ 

where the function can be drawn as in Figure 3.2. 

Figure 3.2 A periodic square wave signal 


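From Figure 3.2, it is apparent that the mean value is zero, so 

$$\frac{a_0}{2} = \frac{1}{T}\int_{-T/2}^{T/2} x(t)\,dt = 0 \quad \text{(mean value)} \tag{3.14}$$ 

and the coefficients $a_n$ and $b_n$ are 

$$a_n = \frac{2}{T}\int_{-T/2}^{T/2} x(t)\cos\left(\frac{2\pi n t}{T}\right)dt = 0$$ 

$$b_n = \frac{2}{T}\int_{-T/2}^{T/2} x(t)\sin\left(\frac{2\pi n t}{T}\right)dt = \frac{2}{T}\left[\int_{-T/2}^{0} -\sin\left(\frac{2\pi n t}{T}\right)dt + \int_{0}^{T/2} \sin\left(\frac{2\pi n t}{T}\right)dt\right] = \frac{2}{n\pi}(1 - \cos n\pi) \tag{3.15}$$ 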


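So Equation (3.13) can be written as 

$$x(t) = \frac{4}{\pi}\left[\sin\left(\frac{2\pi t}{T}\right) + \frac{1}{3}\sin\left(\frac{2\pi 3t}{T}\right) + \frac{1}{5}\sin\left(\frac{2\pi 5t}{T}\right) + \cdots\right] \tag{3.16}$$ 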


We should have anticipated that only a sine wave series is necessary. This follows from 
the fact that the square wave is an ‘odd’ function and so does not require the cosine terms 
which are ‘even’ (even and odd functions will be commented upon later in this section). 
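
The coefficients of the square wave can also be obtained numerically, which gives a quick check on the analytical result $b_n = (2/n\pi)(1 - \cos n\pi)$. The sketch below evaluates the coefficient integrals over one period with the trapezoidal rule; the period, the grid size and the variable names are assumptions made for the illustration.

```matlab
% Sketch: numerical Fourier coefficients of the square wave over one period.
T = 1;                               % assumed period (s)
N = 100000;
t = (-N:N) * (T/(2*N));              % symmetric grid from -T/2 to T/2, with t = 0 exactly
x = sign(t);                         % -1 for t < 0, +1 for t > 0 (0 at the jump)

n  = 1:6;
an = zeros(size(n));
bn = zeros(size(n));
for k = n
    an(k) = (2/T) * trapz(t, x .* cos(2*pi*k*t/T));
    bn(k) = (2/T) * trapz(t, x .* sin(2*pi*k*t/T));
end
bnTheory = (2 ./ (n*pi)) .* (1 - cos(n*pi));   % 4/(n*pi) for odd n, 0 for even n
disp([n; an; bn; bnTheory]);
```

The cosine coefficients vanish to within rounding, and only the odd sine coefficients are non-zero, in line with the remark that an odd function needs only sine terms.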

Let us look at the way the successive terms on the right hand side of Equation (3.16) 
affect the representation. Let $\omega_1 = 2\pi f_1 = 2\pi/T$, so that 

$$x(t) = \frac{4}{\pi}\left[\sin\omega_1 t + \frac{1}{3}\sin 3\omega_1 t + \frac{1}{5}\sin 5\omega_1 t + \cdots\right] \tag{3.17}$$ 


Consider ‘partial sums’ of the series above and their approximation to x(t), i.e. denoted 
by $S_n(t)$, the sum of n terms, as in Figure 3.3. 

Figure 3.3 Partial sums of the Fourier series of the square wave (1 term: $S_1(t) = \frac{4}{\pi}\sin\omega_1 t$) 

Figure 3.4 The coefficients $b_n$ of the Fourier series of the square wave 


Note the behaviour of the partial sums near points of discontinuity displaying what is 
known as the ‘overshoot’ or Gibbs’ phenomenon, which will be discussed shortly. The above 
Fourier series can be represented in the frequency domain, where it appears as a line spectrum 
as shown in Figure 3.4. 

We now note some aspects of Fourier series. 


Convergence of the Fourier Series 

We have assumed that a periodic function may be represented by a Fourier series. Now 
we state (without proof) the conditions (known as the Dirichlet conditions, see Oppenheim 
et al. (1997) for more details) under which a Fourier series representation is possible. The 
sufficient conditions are as follows. If a bounded periodic function with period $T_P$ is piecewise 
continuous (with a finite number of maxima, minima and discontinuities) in the interval 
$-T_P/2 < t < T_P/2$ and has a left and right hand derivative at each point $t_0$ in that interval, 
then its Fourier series converges. The sum is $x(t_0)$ if x is continuous at $t_0$. If x is not continuous 
at $t_0$, then the sum is the average of the left and right hand limits of x at $t_0$. In the square wave 
example above, at t = 0 the Fourier series converges to 

$$\frac{1}{2}\left[\lim_{t \to 0+} x(t) + \lim_{t \to 0-} x(t)\right] = \frac{1}{2}[1 - 1] = 0$$ 


Gibbs’ Phenomenon (M3.1) 

When a function is approximated by a partial sum of a Fourier series, there will be a 
significant error in the vicinity of a discontinuity, no matter how many terms are used for 
the partial sum. This is known as Gibbs’ phenomenon. 

Consider the square wave in the previous example. As illustrated in Figure 3.5, near 
the discontinuity the continuous terms of the series struggle to simulate a sudden jump. 
As the number of terms in the partial sum is increased, the ‘ripples’ are squashed towards 
the point of the discontinuity, but the overshoot does not reduce to zero. In this example it 
turns out that a lower bound on the overshoot is about 9% of the height of the discontinuity 
(Oppenheim et al., 1997). 

Figure 3.5 Illustrations of Gibbs’ phenomenon (partial sums with 7 terms and 20 terms, showing the overshoot) 
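
The size of the overshoot can be estimated with a short sketch that evaluates partial sums of the square wave series (odd sine harmonics with weights 4/(n pi)) just to the right of the jump at t = 0. The numbers of terms, the time window and the variable names are assumptions chosen for the illustration; the overshoot is expressed as a fraction of the jump height, which is 2 for this square wave.

```matlab
% Sketch: overshoot of partial sums of the square wave series near t = 0.
T  = 1;  w1 = 2*pi/T;                 % assumed period and fundamental (rad/s)
t  = linspace(0, 0.05*T, 5001);       % fine grid just after the discontinuity
Nterms = [7 20 100];                  % numbers of terms in the partial sums
overshoot = zeros(size(Nterms));
for i = 1:numel(Nterms)
    k = 2*(1:Nterms(i)) - 1;                       % odd harmonics 1, 3, 5, ...
    S = (4/pi) * ((1 ./ k) * sin(w1 * k' * t));    % partial sum on the grid
    overshoot(i) = (max(S) - 1) / 2;               % fraction of the jump height
end
disp(overshoot);
```

The fraction stays close to 0.09 for all three partial sums; adding terms moves the peak towards the discontinuity but does not remove it.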


Differentiation and Integration of Fourier Series 

If x(t) satisfies the Dirichlet conditions, then its Fourier series may be integrated term by 
term. Integration ‘smoothes’ jumps and so results in a series whose convergence is enhanced. 
Satisfaction of the Dirichlet conditions by x(t) does not justify differentiation term by term. 
But, if periodic x(t) is continuous and its derivative, $\dot{x}(t)$, satisfies the Dirichlet conditions, 
then the Fourier series of $\dot{x}(t)$ may be obtained from the Fourier series of x(t) by differentiating 
term by term. Note, however, that these are general guidelines only. Each situation should be 
considered carefully. For example, the integral of a periodic function for which $a_0 \neq 0$ (mean 
value of the signal is not zero) is no longer periodic. 


Even and Odd Functions 

A function x(t) is even if $x(t) = x(-t)$, as shown for example in Figure 3.6. 

A function x(t) is odd if $x(t) = -x(-t)$, as shown for example in Figure 3.7. 

Any function x(t) may be expressed as the sum of even and odd functions, i.e. 

$$x(t) = \frac{1}{2}\left[x(t) + x(-t)\right] + \frac{1}{2}\left[x(t) - x(-t)\right] = x_e(t) + x_o(t) \tag{3.18}$$ 

If x(t) and y(t) are two functions, then the following four properties hold: 

1. If x(t) is odd and y(t) is odd, then $x(t) \cdot y(t)$ is even. 

2. If x(t) is odd and y(t) is even, then $x(t) \cdot y(t)$ is odd. 

3. If x(t) is even and y(t) is odd, then $x(t) \cdot y(t)$ is odd. 

4. If x(t) is even and y(t) is even, then $x(t) \cdot y(t)$ is even. 

Figure 3.6 An example of an even function 

Figure 3.7 An example of an odd function 

Also: 

1. If x(t) is odd, then $\int_{-a}^{a} x(t)\,dt = 0$. 

2. If x(t) is even, then $\int_{-a}^{a} x(t)\,dt = 2\int_{0}^{a} x(t)\,dt$. 
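
The even and odd decomposition and the two integral properties can be checked numerically. In the sketch below the function x(t) is an arbitrary illustrative choice, and the time axis is built to be exactly symmetric about t = 0 so that reversing the sample order gives the samples of x(-t); these choices and the variable names are assumptions.

```matlab
% Sketch: even/odd decomposition of a sampled function on a symmetric time axis.
t  = (-1000:1000) / 1000;         % symmetric axis from -1 to 1, with t = 0 exactly
x  = exp(-t) .* (1 + t);          % arbitrary illustrative function
xr = fliplr(x);                   % samples of x(-t)

xe = 0.5 * (x + xr);              % even part
xo = 0.5 * (x - xr);              % odd part
reconErr = max(abs(x - (xe + xo)));       % x = xe + xo

Iodd  = trapz(t, xo);                     % integral of the odd part over [-a, a]
Ieven = trapz(t, xe);                     % integral of the even part over [-a, a]
Ihalf = 2 * trapz(t(t >= 0), xe(t >= 0)); % twice the integral over [0, a]
```

The odd part integrates to zero over the symmetric interval, and the integral of the even part equals twice its integral over the positive half, both to within rounding.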


Fourier Series of Odd and Even Functions 
If x(t) is an odd periodic function with period $T_P$, then 

$$x(t) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{2\pi n t}{T_P}\right)$$ 

It is a series of odd terms only with a zero mean value, i.e. $a_n = 0$, $n = 0, 1, 2, \ldots$. If x(t) is 
an even periodic function with period $T_P$, then 

$$x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\left(\frac{2\pi n t}{T_P}\right)$$ 

It is now a series of even terms only, i.e. $b_n = 0$, $n = 1, 2, \ldots$. 

We now have a short ‘mathematical aside’ to introduce the delta function, which turns 
out to be very convenient in signal analysis. 



3.2 THE DELTA FUNCTION 

The Dirac delta function is denoted by $\delta(t)$, and is sometimes called the unit impulse 
function. Mathematically, it is defined by 

$$\delta(t) = 0 \text{ for } t \neq 0, \text{ and } \int_{-\infty}^{\infty} \delta(t)\,dt = 1 \tag{3.19}$$ 

This is not an ordinary function, but is classified as a ‘generalized’ function. We may 
consider this function as a very narrow and tall spike at t = 0 as illustrated in Figure 3.8. 
Then, Figure 3.8 can be expressed by Equation (3.20), where the integration of the 
function is $\int_{-\infty}^{\infty} \delta_\varepsilon(t)\,dt = 1$. This is a unit impulse. 

$$\delta_\varepsilon(t) = \begin{cases} \dfrac{1}{\varepsilon}, & -\dfrac{\varepsilon}{2} < t < \dfrac{\varepsilon}{2} \\ 0, & \text{otherwise} \end{cases} \tag{3.20}$$ 

Figure 3.8 Model of a unit impulse (a rectangle of height $1/\varepsilon$ between $-\varepsilon/2$ and $\varepsilon/2$) 


Now, if we visualize the spike as infinitesimally narrow, i.e. $\delta(t) = \lim_{\varepsilon \to 0} \delta_\varepsilon(t)$, 
then we may represent it on a graph as shown in Figure 3.9, i.e. an arrow whose height 
indicates the magnitude of the impulse. 

Figure 3.9 Graphical representation of the delta function 


An alternative interpretation of the unit impulse is to express it in terms of the unit step 
function u(t) defined as 


$$u(t) = \begin{cases} 1, & t > 0 \\ 0, & t < 0 \end{cases} \tag{3.21}$$ 

Since Equation (3.20) can be obtained by using two unit step functions appropriately, i.e. 
$\delta_\varepsilon(t) = (1/\varepsilon)\left[u(t + \varepsilon/2) - u(t - \varepsilon/2)\right]$, the Dirac delta function and the unit step function 
have the following relationship, which is the Dirac delta function as the derivative of the unit 
step function: 

$$\delta(t) = \lim_{\varepsilon \to 0} \delta_\varepsilon(t) = \frac{d}{dt}u(t) \tag{3.22}$$ 

Note that the concept of Equation (3.22) makes it possible to deal with differentiating functions 
that contain discontinuities. 


Properties of the Delta Function 

A shifted delta function: if a delta function is located at t = a, then it can be written as 
$\delta(t - a)$. Some useful properties of the delta function are: 

1. $\delta(t) = \delta(-t)$, i.e. a delta function is an even function. 

2. Sifting property: if x(t) is an ‘ordinary’ function, then the integral of the product of the 
ordinary function and a shifted delta function is 

$$\int_{-\infty}^{\infty} x(t)\delta(t - a)\,dt = x(a) \tag{3.23}$$ 

i.e. the delta function ‘sifts out’ the value of the ordinary function at t = a. 
The result (3.23) is justified in Figure 3.10 which shows the product of an ordinary function 
and a shifted $\delta_\varepsilon(t)$, i.e. 

$$I_\varepsilon = \int_{-\infty}^{\infty} x(t)\delta_\varepsilon(t - a)\,dt \tag{3.24}$$ 

We evaluate this integral and then let $\varepsilon \to 0$ to obtain Equation (3.23), i.e. 

$$I_\varepsilon = \frac{1}{\varepsilon}\int_{a - \varepsilon/2}^{a + \varepsilon/2} x(t)\,dt \tag{3.25}$$ 

This is the average height of x(t) within the range $a - \varepsilon/2 < t < a + \varepsilon/2$ and we write this 
as $x(a + \theta\varepsilon/2)$, $|\theta| < 1$. So $I_\varepsilon = x(a + \theta\varepsilon/2)$, from which $\lim_{\varepsilon \to 0} I_\varepsilon = x(a)$. This justifies 
Equation (3.23). 
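
The limiting argument can be followed numerically by averaging an ordinary function over narrower and narrower windows centred on t = a. The function, the value of a, the window widths and the variable names in this sketch are all hypothetical; the average over each window is computed with the trapezoidal rule.

```matlab
% Sketch: the windowed average of x(t) about t = a approaches x(a) as the width shrinks.
a = 0.4;                                   % location of the shifted impulse
xfun = @(t) cos(2*pi*t) + t.^2;            % arbitrary smooth 'ordinary' function
widths = [0.1 0.01 0.001];                 % window widths (epsilon)
I = zeros(size(widths));
for i = 1:numel(widths)
    w = widths(i);
    t = linspace(a - w/2, a + w/2, 2001);
    I(i) = (1/w) * trapz(t, xfun(t));      % average height over the window
end
err = abs(I - xfun(a));                    % distance from the 'sifted' value x(a)
disp(err);
```

The error falls by roughly a factor of 100 each time the width is reduced by a factor of 10, which is consistent with the average tending to x(a) in the limit.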


3. 
$$\int_{-\infty}^{\infty} e^{\pm j2\pi a t}\,dt = \delta(a), \quad \text{or} \quad \int_{-\infty}^{\infty} e^{\pm j a t}\,dt = 2\pi\delta(a) \tag{3.26}$$ 

The justification of this property is given below, where the delta function is described in terms 
of the limiting form of a tall narrow ‘sinc function’ as shown in Figure 3.11: 

$$\int_{-\infty}^{\infty} e^{\pm j2\pi a t}\,dt = \lim_{M \to \infty}\int_{-M}^{M} (\cos 2\pi a t \pm j\sin 2\pi a t)\,dt = \lim_{M \to \infty} 2\int_{0}^{M} \cos 2\pi a t\,dt = \lim_{M \to \infty} 2\left[\frac{\sin 2\pi a t}{2\pi a}\right]_0^M = \lim_{M \to \infty} 2M\,\frac{\sin 2\pi a M}{2\pi a M} = \delta(a) \tag{3.27}$$ 

Note that it can be verified that the integral of the function in Figure 3.11 is unity (see 
Appendix A). 


4. $\delta(at) = \dfrac{1}{|a|}\,\delta(t)$, where $a$ is an arbitrary non-zero constant (3.28)

5. $\displaystyle\int_{-\infty}^{\infty} f(t)\,\delta^{(n)}(t - a)\,dt = (-1)^n f^{(n)}(a)$, where $(n)$ denotes the $n$th derivative (3.29)

Figure 3.11 Representation of the delta function using a sinc function
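The sifting property and the unit area of the sinc-shaped function can be checked numerically. The following is only a rough sketch: the test function `x`, the location `a`, the value `M = 5` and the integration grids are all assumptions chosen for illustration, and `trapz` stands in for the exact integrals.

```matlab
% Sifting property: the average of x(t) over a window of width e centred at a (Equation (3.25))
x = @(t) cos(2*pi*t) + t.^2;    % hypothetical 'ordinary' function
a = 0.3;                        % location of the shifted delta function
for e = [0.5 0.1 0.01]
    t  = linspace(a - e/2, a + e/2, 2001);
    Ie = trapz(t, x(t))/e;      % tends to x(a) as e becomes small
    fprintf('e = %5.2f   I_e = %.6f   x(a) = %.6f\n', e, Ie, x(a));
end

% Property 3: the area under 2M*sin(2*pi*a*M)/(2*pi*a*M) stays close to unity
M  = 5;  da = 1e-3;
av = (-50000:50000)*da;         % symmetric grid in a; av(50001) is exactly 0
g  = 2*M*sin(2*pi*av*M)./(2*pi*av*M);
g(50001) = 2*M;                 % limiting value at a = 0 replaces the 0/0 entry
area = trapz(av, g);
fprintf('area under the sinc-shaped function = %.4f\n', area);
```

Increasing `M` makes the function taller and narrower while the computed area remains close to one, which is the behaviour Equation (3.27) relies on.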


3.3 FOURIER SERIES AND THE DELTA FUNCTION 


As already noted, we can differentiate discontinuous functions if we introduce delta functions. Let us consider the Fourier series of derivatives of discontinuous periodic functions. Consider an example of a discontinuous function $x(t)$ as shown in Figure 3.12, whose Fourier series is given as

$$x(t) = \frac{1}{2} + \frac{1}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\left(\frac{2\pi nt}{T}\right)$$

(Note that this is an odd function offset with a d.c. component.)
Differentiating the function in the figure gives

$$-\frac{1}{T} + \sum_{n=-\infty}^{\infty}\delta(t - nT)$$

and differentiating the Fourier series term by term gives

$$\frac{2}{T}\sum_{n=1}^{\infty}\cos\left(\frac{2\pi nt}{T}\right)$$

Equating these gives

$$\sum_{n=-\infty}^{\infty}\delta(t - nT) = \frac{1}{T} + \frac{2}{T}\sum_{n=1}^{\infty}\cos\left(\frac{2\pi nt}{T}\right) \qquad (3.30)$$

This is a periodic ‘train’ of impulses, and it has a Fourier series representation whose coefficients are constant ($2/T$) for all frequencies except for the d.c. component. The periodic train of impulses (usually written $\delta_T(t)$ or $i(t)$) is drawn as in Figure 3.13, and will be used in sampling theory later.
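As a quick numerical check on the series for the function of Figure 3.12, a partial sum can be compared with $1 - t/T$ at a point away from the discontinuities. This is a sketch only; the period `T = 2`, the evaluation point and the number of terms are arbitrary assumptions.

```matlab
T  = 2;                         % hypothetical period
t  = 0.25*T;                    % a point away from the discontinuities
n  = 1:2000;                    % harmonics included in the partial sum
xN = 1/2 + (1/pi)*sum(sin(2*pi*n*t/T)./n);
fprintf('partial sum = %.4f, 1 - t/T = %.4f\n', xN, 1 - t/T);
```

Convergence is slow because the coefficients decay only as $1/n$, which is typical of a function with jump discontinuities.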



Figure 3.12 An example of a discontinuous periodic function: $x(t) = 1 - t/T$ for $0 < t < T$, and $x(t + nT) = x(t)$, $n = \pm 1, \pm 2, \ldots$

Figure 3.13 The periodic train of impulses


So far we have used sines and cosines explicitly in the Fourier series. These functions can be combined using $e^{\pm j\theta} = \cos\theta \pm j\sin\theta$ and so sines and cosines can be replaced by complex exponentials. This is the basis of the complex form of the Fourier series.


3.4 THE COMPLEX FORM OF THE FOURIER SERIES 


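It is often convenient to express a Fourier series in terms of $e^{\pm j\omega_1 nt}$ ($\omega_1 = 2\pi/T_P$). Note that

$$\cos\theta = \frac{1}{2}\left(e^{j\theta} + e^{-j\theta}\right) \quad \text{and} \quad \sin\theta = \frac{1}{2j}\left(e^{j\theta} - e^{-j\theta}\right)$$

so the Fourier series defined in Equation (3.3)

$$x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n\cos\left(\frac{2\pi nt}{T_P}\right) + b_n\sin\left(\frac{2\pi nt}{T_P}\right)\right]$$

becomes

$$x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[\frac{a_n}{2}\left(e^{j2\pi nt/T_P} + e^{-j2\pi nt/T_P}\right) + \frac{b_n}{2j}\left(e^{j2\pi nt/T_P} - e^{-j2\pi nt/T_P}\right)\right]$$

$$= \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[\frac{a_n - jb_n}{2}\,e^{j2\pi nt/T_P} + \frac{a_n + jb_n}{2}\,e^{-j2\pi nt/T_P}\right]$$

$$= \frac{a_0}{2} + \sum_{n=1}^{\infty}\frac{a_n - jb_n}{2}\,e^{j2\pi nt/T_P} + \sum_{n=1}^{\infty}\frac{a_n + jb_n}{2}\,e^{-j2\pi nt/T_P} \qquad (3.31)$$

Let $c_0 = a_0/2$, $c_n = (a_n - jb_n)/2$, so $c_n^* = (a_n + jb_n)/2$, i.e.

$$x(t) = c_0 + \sum_{n=1}^{\infty}c_n e^{j2\pi nt/T_P} + \sum_{n=1}^{\infty}c_n^* e^{-j2\pi nt/T_P} \qquad (3.32)$$

Substituting for $a_n$ and $b_n$ from Equation (3.4) gives

$$c_0 = \frac{1}{T_P}\int_0^{T_P}x(t)\,dt, \quad c_n = \frac{1}{T_P}\int_0^{T_P}x(t)e^{-j2\pi nt/T_P}\,dt, \quad c_n^* = \frac{1}{T_P}\int_0^{T_P}x(t)e^{j2\pi nt/T_P}\,dt = c_{-n} \qquad (3.33)$$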
Note that the negative frequency terms ($c_{-n}$) are introduced, so that

$$\sum_{n=1}^{\infty}c_n^* e^{-j2\pi nt/T_P} = \sum_{n=1}^{\infty}c_{-n} e^{-j2\pi nt/T_P}$$

in Equation (3.32). Thus, we obtain the very important results:

$$x(t) = \sum_{n=-\infty}^{\infty}c_n e^{j2\pi nt/T_P} \qquad (3.34)$$

$$c_n = \frac{1}{T_P}\int_0^{T_P}x(t)e^{-j2\pi nt/T_P}\,dt \qquad (3.35)$$

Note that the ‘basic elements’ are now complex exponentials and the Fourier coefficients $c_n$ are also complex, representing both amplitude and phase information. Note also that the notion of ‘negative frequencies’, i.e. $f_n = n/T_P$, $n = 0, \pm 1, \pm 2, \ldots$, has been introduced by the algebraic manipulation in Equation (3.33).
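Equation (3.35) can be evaluated numerically for a simple case. The sketch below assumes the sawtooth $x(t) = 1 - t/T_P$ of Figure 3.12 with $T_P = 1$, for which direct integration gives $c_0 = 1/2$ and $c_n = -j/(2\pi n)$; `trapz` on an arbitrary fine grid replaces the integral.

```matlab
Tp = 1;                                   % assumed period (seconds)
t  = linspace(0, Tp, 20001);
x  = 1 - t/Tp;                            % sawtooth of Figure 3.12 over one period
c  = @(n) trapz(t, x.*exp(-1j*2*pi*n*t/Tp))/Tp;   % Equation (3.35)
c0 = c(0);  c3 = c(3);  cm3 = c(-3);
fprintf('c_0  = %.4f\n', real(c0));
fprintf('c_3  = %.5f%+.5fj\n', real(c3), imag(c3));
fprintf('c_-3 = %.5f%+.5fj\n', real(cm3), imag(cm3));
```

For this real-valued signal the computed $c_{-3}$ is the complex conjugate of $c_3$, as Equation (3.33) states.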

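Referring to Equation (3.12)

$$x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}M_n\cos(2\pi nf_1t + \phi_n)$$

the relationship between the coefficients is given by

$$c_n = \frac{a_n - jb_n}{2}, \quad \text{so} \quad |c_n| = \frac{1}{2}\sqrt{a_n^2 + b_n^2} = \frac{M_n}{2} \quad \text{and} \quad \arg c_n = \tan^{-1}\left(-\frac{b_n}{a_n}\right) = \phi_n \quad \text{for } n \neq 0 \qquad (3.36)$$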


All previous discussions on Fourier series still hold except that now manipulations are considerably easier using the complex form. We note a generalization of the concept of orthogonality of functions for the complex case. Complex-valued functions $u_n(t)$ are orthogonal if

$$\int_0^{T_P}u_n(t)\,u_m^*(t)\,dt = 0 \quad \text{for } m \neq n \qquad (3.37)$$

This is easily verified by using $u_n(t) = e^{j2\pi nt/T_P}$. Also, when $n = m$, the integral is $T_P$.
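The orthogonality relation (3.37) can also be confirmed numerically. In this minimal sketch the period `Tp = 2` and the indices 2 and 3 are arbitrary assumptions.

```matlab
Tp  = 2;                                  % assumed period
t   = linspace(0, Tp, 10001);
u   = @(n) exp(1j*2*pi*n*t/Tp);           % u_n(t)
I23 = trapz(t, u(2).*conj(u(3)));         % m ~= n: expected to be close to 0
I33 = trapz(t, u(3).*conj(u(3)));         % m == n: expected to be close to Tp
fprintf('|I_23| = %.2e, I_33 = %.4f\n', abs(I23), real(I33));
```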


3.5 SPECTRA 

We now introduce the notion of the spectrum of a process. We shall refer to the complex representation of Equation (3.34) using positive and negative frequencies and also Equation (3.12) using only positive frequencies. Referring to Equation (3.34) first, a plot of the magnitude $|c_n|$ versus frequency $f$ (or $\omega$) is called the amplitude spectrum of the periodic function $x(t)$. A plot of the phase angle $\arg c_n$ versus frequency is called the phase spectrum. These are not continuous curves but take values only at discrete values for $f = n/T_P$, $n = 0, \pm 1, \pm 2, \ldots$. We can draw these spectra as in Figures 3.14 and 3.15 respectively.



Figure 3.14 Amplitude spectrum of a Fourier series (a line spectrum and an even function) 


Figure 3.15 Phase spectrum of a Fourier series (a line spectrum and an odd function)


If we did not want to include negative frequencies, we could plot $M_n$, $\phi_n$ (Equation (3.12)) versus $n$ above (note that $M_n = 2|c_n|$ for $n \neq 0$).

As an example, consider a periodic function that can be depicted as in Figure 3.16. A calculation will give the coefficients as

$$c_n = \frac{Ad}{T}\,\frac{\sin(n\pi d/T)}{n\pi d/T}, \quad \arg c_n = 0$$

Since this function is even the phase spectrum is zero for all frequency components. If, for example, $T = 1/4$, $d = 1/20$, then the amplitude spectrum is as given in Figure 3.17.

If the function is shifted to the right by $d/2$, then the function is depicted as in Figure 3.18. Then, $|c_n|$ is unchanged but the phase components are changed, so that $\arg c_n = -n\pi(d/T)$ rads.
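The effect of the shift can be illustrated numerically. The sketch below assumes that the pulse of Figure 3.16 has height $A = 1$ and width $d$ and is centred on $t = 0$; it uses $T = 1/4$ and $d = 1/20$ as quoted above, evaluates only the $n = 2$ coefficient, and replaces the integral by `trapz`.

```matlab
A = 1;  T = 1/4;  d = 1/20;               % A = 1 is an assumption; T and d as in the text
t  = linspace(-T/2, T/2, 200001);
x0 = A*(abs(t) < d/2);                    % pulse centred on t = 0 (even function)
x1 = A*(abs(t - d/2) < d/2);              % the same pulse shifted to the right by d/2
n  = 2;
c0 = trapz(t, x0.*exp(-1j*2*pi*n*t/T))/T; % Equation (3.35) over one period
c1 = trapz(t, x1.*exp(-1j*2*pi*n*t/T))/T;
fprintf('|c_n|: %.5f (centred), %.5f (shifted)\n', abs(c0), abs(c1));
fprintf('arg c_n (shifted) = %.4f rad, -n*pi*d/T = %.4f rad\n', angle(c1), -n*pi*d/T);
```

The magnitudes agree to within the discretization error, while the phase of the shifted pulse follows $-n\pi d/T$.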


Figure 3.16 A periodic function of a rectangular pulse

Figure 3.17 Amplitude spectrum of the function given in Figure 3.16

Figure 3.18 A periodic function of a rectangular pulse shifted to the right by d/2


Parseval’s Theorem - The Power Spectrum 

Suppose $x(t)$ is interpreted as a voltage. Then the instantaneous power dissipated across a 1 ohm resistor is $x^2(t)$, and the average power dissipated across the resistor is

$$\frac{1}{T_P}\int_0^{T_P}x^2(t)\,dt$$

Now, using Equation (3.34),

$$x(t) = \sum_{n=-\infty}^{\infty}c_n e^{j2\pi nt/T_P}$$

the voltage squared is

$$x^2(t) = x(t)\cdot x^*(t) = \sum_{n=-\infty}^{\infty}c_n e^{j2\pi nt/T_P}\cdot\sum_{m=-\infty}^{\infty}c_m^* e^{-j2\pi mt/T_P}$$

Thus, the average power can be written as

$$\frac{1}{T_P}\int_0^{T_P}x^2(t)\,dt = \frac{1}{T_P}\sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}c_n c_m^*\int_0^{T_P}e^{j2\pi(n-m)t/T_P}\,dt \qquad (3.38)$$

By the property of orthogonality, this is reduced to the following form known as (one form of) Parseval’s theorem:

$$\frac{1}{T_P}\int_0^{T_P}x^2(t)\,dt = \sum_{n=-\infty}^{\infty}|c_n|^2 \qquad (3.39)$$

This has a ‘physical’ interpretation. It indicates that the average power of the signal $x(t)$ may be regarded as the sum of power associated with individual frequency components. The power spectrum is $|c_n|^2$ and may be drawn (typically) as in Figure 3.19. It is a decomposition of the power of the process over frequency. Note that the power spectrum is real valued and even (and there is no phase information).



Figure 3.19 An example of a power spectrum (Compare with Figure 3.14)

If we wished to restrict ourselves to positive frequencies only, we could fold the left hand portion over to double the values at frequencies $f = n/T_P$, $n = 1, 2, \ldots$. The name ‘periodogram’ is sometimes given to this power decomposition.
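Parseval’s theorem (3.39) is easy to check numerically for a signal with only a few harmonics. The test signal below is a hypothetical choice (not from the text) whose non-zero coefficients are $c_0 = 1$, $c_{\pm 1} = 1$ and $c_{\pm 3} = \mp j/2$, so both sides should equal 3.5.

```matlab
Tp = 1;  t = linspace(0, Tp, 4001);
x  = 1 + 2*cos(2*pi*t/Tp) + sin(2*pi*3*t/Tp);   % hypothetical test signal
avgPower = trapz(t, x.^2)/Tp;                   % left hand side of Equation (3.39)
n  = -5:5;  c = zeros(size(n));
for k = 1:numel(n)
    c(k) = trapz(t, x.*exp(-1j*2*pi*n(k)*t/Tp))/Tp;   % Equation (3.35)
end
sumPower = sum(abs(c).^2);                      % right hand side of Equation (3.39)
fprintf('average power = %.6f, sum of |c_n|^2 = %.6f\n', avgPower, sumPower);
```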


3.6 SOME COMPUTATIONAL CONSIDERATIONS

When calculating the Fourier coefficients of a periodic signal which we measure, it is important to be able to identify the period of the signal. For example, if $x(t)$ has the form shown in Figure 3.20, the period is $T_P$.

Figure 3.20 A periodic signal with a period $T_P$

If we acquire (record or measure) exactly one period, we can calculate Fourier coefficients correctly (subject to ‘computational errors’) from the formula given by Equation (3.35), i.e.

$$c_n = \frac{1}{T_P}\int_0^{T_P}x(t)e^{-j2\pi nt/T_P}\,dt$$

If we acquire $rT_P$ seconds of data and use the formula

$$c_n = \frac{1}{rT_P}\int_0^{rT_P}x(t)e^{-j2\pi nt/T_P}\,dt \qquad (3.40)$$

then if $r$ is an integer we can obtain the same Fourier coefficients spaced at frequencies $1/T_P$ along the frequency axis.

However, if $r$ is not an integer and we use the formula

$$c_n = \frac{1}{rT_P}\int_0^{rT_P}x(t)e^{-j2\pi nt/(rT_P)}\,dt \qquad (3.41)$$

then the Fourier coefficients need careful consideration. For example, if we use the period of $1.5T_P$ (note that $r$ is no longer an integer), then we are assuming that the signal is as shown in Figure 3.21 (compare this with the true signal in Figure 3.20).

Clearly, the Fourier coefficients change (we see immediately that there is a non-zero mean value) and frequencies are present at every $1/(1.5T_P)$ Hz.

In practice, if the period is not known, then it is necessary to capture a large number of periods so that ‘end effects’ are small. This point should be noted since computational methods of obtaining Fourier coefficients often restrict the data set of $N$ points where $N = 2^M$, i.e. a power of two ($M$ is an integer). This means we may analyse a non-integer number of periods. These features are now demonstrated for a square wave.

For the square wave shown in Figure 3.20 (a period of $T_P$), the theoretical Fourier coefficient $c_n$ has magnitude

$$|c_n| = \begin{cases} \dfrac{2}{n\pi} & \text{for } n \text{ odd} \\ 0 & \text{for } n = 0, \text{ even} \end{cases} \qquad (3.42)$$
Figure 3.21 A periodic signal with a period of $1.5T_P$

If $T_P = 1$ second, frequency components $|c_n|$ occur at every $1/T_P = 1$ Hz. Figure 3.22(a) shows $|c_n|$ plotted (for $n \geq 0$ only) up to 10 Hz for an exact single period using Equation (3.35) (or using Equation (3.41) with $r = 1$). Figure 3.22(b) shows $|c_n|$ for two periods using Equation (3.41), where $r = 2$. By comparing these two figures, it can be seen that the Fourier coefficients are exactly the same except that Figure 3.22(b) has more ‘zeros’ due to the fact that we calculate the coefficients at every $1/(2T_P) = 0.5$ Hz.

Figures 3.22(c) and (d) show changes in the amplitude spectra when a non-integer number of periods is taken (i.e. non-integer $r$) and Equation (3.41) is used. In Figure 3.22(c), 1.5 periods are used. Note that it appears that there are no frequency components at 1, 3, 5, ... Hz as indicated in Figures 3.22(a) and (b). For Figure 3.22(d), 3.5 periods are taken. The increased ‘density’ of the line frequencies shows maxima near the true values (also refer to Figure 3.22(e) where 10.5 periods are taken). Note that in Figures 3.22(c)-(e) the amplitudes have decreased. This follows since there is an increased density of frequency components in the decomposition. Recall Parseval’s identity (theorem)

$$\frac{1}{T_P}\int_0^{T_P}x^2(t)\,dt = \sum_{n=-\infty}^{\infty}|c_n|^2$$

and note the $1/T_P$ on the left hand side. For the square wave considered, the average power is always unity when an integer number of periods is included. When a non-integer number of periods is included the increased density of frequency components means that the amplitudes change.
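The effect of a non-integer number of periods can be reproduced by evaluating Equation (3.41) numerically. This is a sketch under stated assumptions: the square wave is taken to be $+1$ over the first half period and $-1$ over the second with $T_P = 1$ second, the helper `sq` is hypothetical, and `trapz` replaces the integral.

```matlab
Tp = 1;                                    % period of the square wave (seconds)
sq = @(t) 1 - 2*(mod(t, Tp) >= Tp/2);      % assumed form: +1 on the first half period, -1 on the second
meanVal = [];                              % |c_0| found for each record length
for r = [1 1.5]
    t = linspace(0, r*Tp, 300001);
    x = sq(t);
    n = 0:3;  c = zeros(size(n));
    for k = 1:numel(n)
        c(k) = trapz(t, x.*exp(-1j*2*pi*n(k)*t/(r*Tp)))/(r*Tp);   % Equation (3.41)
    end
    meanVal(end+1) = abs(c(1));
    fprintf('r = %.1f\n', r);
    fprintf('  f = %.3f Hz, |c_n| = %.4f\n', [n/(r*Tp); abs(c)]);
end
```

With $r = 1$ the mean is zero and the first line sits at 1 Hz; with $r = 1.5$ the lines fall at multiples of $1/1.5$ Hz, so none lies at 1 Hz, and a non-zero mean appears.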

Some Comments on the Computation of Fourier Coefficients 

This is an introductory comment on the computation of Fourier coefficients, which will be expanded later in Chapter 6. We address the problem of performing the following integral using digital techniques:

$$c_k = \frac{1}{T_P}\int_0^{T_P}x(t)e^{-j2\pi kt/T_P}\,dt \qquad (3.43)$$

Consider an arbitrary signal measured for $T_P$ seconds as shown in Figure 3.23. Suppose the signal is sliced as shown in the figure, and the values of $x(t)$ at $N$ discrete points $x(n\Delta)$, $n = 0, 1, 2, \ldots, N - 1$, are known, each point separated by a time interval $\Delta$, say. Then a logical and simple approximation to the integral of Equation (3.43) is

$$c_k \approx \frac{1}{N\Delta}\sum_{n=0}^{N-1}x(n\Delta)\,e^{-j2\pi kn\Delta/(N\Delta)}\cdot\Delta \qquad (3.44)$$

or

$$c_k \approx \frac{1}{N}\sum_{n=0}^{N-1}x(n\Delta)\,e^{-j2\pi kn/N} = \frac{X_k}{N} \quad \text{(say)} \qquad (3.45)$$
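The approximation (3.45) can be tried directly, since MATLAB’s built-in `fft` evaluates the same finite sum with the same sign convention. The test signal and the choice $N = 64$ below are assumptions for illustration; the signal has $c_0 = 1$, $c_{\pm 1} = 1$ and $c_{\pm 3} = \mp j/2$.

```matlab
Tp = 1;  N = 64;  D = Tp/N;               % N samples spaced D apart over exactly one period
n  = 0:N-1;
x  = 1 + 2*cos(2*pi*n*D/Tp) + sin(2*pi*3*n*D/Tp);   % hypothetical samples x(n*D)
X  = fft(x);                               % X_k for k = 0, ..., N-1 (stored at index k+1)
ck = X/N;                                  % Equation (3.45)
fprintf('c_0 ~ %.4f, c_1 ~ %.4f, c_3 ~ %.4f%+.4fj\n', ...
        real(ck(1)), real(ck(2)), real(ck(4)), imag(ck(4)));
```

For this band-limited signal sampled over an integer number of periods the agreement is essentially exact; the last entry `ck(N)` equals $c_{-1}$, not a coefficient at a high positive frequency.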


Figure 3.22 Fourier coefficients of a square wave, all computed using Equation (3.41): (a) with a period $T_P$ ($r = 1$); (b) with a period $2T_P$ ($r = 2$); (c) with a period $1.5T_P$ ($r = 1.5$); (d) with a period $3.5T_P$ ($r = 3.5$); (e) with a period $10.5T_P$ ($r = 10.5$)

Figure 3.23 An arbitrary signal measured for $T_P$ seconds


The expression $X_k$ on the right hand side of Equation (3.45) is called a discrete Fourier transform (DFT) and we note some important points that will be further discussed later:

1. It is a finite summation.

2. Do not assume $X_k/N = c_k$. In fact $X_k$ turns out to be periodic (proved below), i.e. $X_k = X_{k+rN}$, where $r$ is an integer, though it would seem reasonable to expect that, if $\Delta$ is ‘sufficiently small’, then $c_k \approx X_k/N$ for at least some ‘useful’ range of values for $k$.

Proof of the periodic nature of the DFT: From

$$X_k = \sum_{n=0}^{N-1}x_n e^{-j(2\pi/N)nk}$$

(note the change in notation $x_n = x(n\Delta)$), substitute $k$ by $k + rN$ ($r$ is an integer). Then the equation becomes

$$X_{k+rN} = \sum_{n=0}^{N-1}x_n e^{-j(2\pi/N)n(k+rN)} = \sum_{n=0}^{N-1}x_n e^{-j(2\pi/N)nk}\,e^{-j2\pi nr}$$

and thus, since $e^{-j2\pi nr} = 1$, $X_{k+rN} = X_k$.

3. The DFT relationship

$$X_k = \sum_{n=0}^{N-1}x_n e^{-j(2\pi/N)nk} \qquad (3.46)$$

has the inverse relationship (IDFT)

$$x_n = \frac{1}{N}\sum_{k=0}^{N-1}X_k e^{j(2\pi/N)nk} \qquad (3.47)$$

So, although $X_k$ may not provide enough information to allow the continuous time series $x(t)$ to be obtained, it is important to realize that it does permit the discrete values of the series $x_n$ to be regained exactly.








Proof of the inverse DFT (IDFT) relationship: Starting from

$$X_k = \sum_{n=0}^{N-1}x_n e^{-j(2\pi/N)nk}$$

multiply both sides by $e^{j(2\pi/N)sk}$ and sum over $k$ ($s$ is an integer, $0 \leq s \leq N - 1$). This yields

$$\sum_{k=0}^{N-1}X_k e^{j(2\pi/N)sk} = \sum_{k=0}^{N-1}\sum_{n=0}^{N-1}x_n e^{-j(2\pi/N)nk}e^{j(2\pi/N)sk} = \sum_{k=0}^{N-1}\sum_{n=0}^{N-1}x_n e^{j(2\pi/N)(s-n)k} = \sum_{n=0}^{N-1}x_n\sum_{k=0}^{N-1}e^{j(2\pi/N)(s-n)k}$$

Consider the second summation. Let $s - n = m$ (integer); then we get $\sum_{k=0}^{N-1}e^{j(2\pi/N)mk}$.

(a) If $m = 0$, this is $N$.

(b) If $m \neq 0$, this is a ‘geometric series’ with common ratio $e^{j(2\pi/N)m}$ and the sum is

$$S_{N-1} = \frac{1 - \left(e^{j(2\pi/N)m}\right)^N}{1 - e^{j(2\pi/N)m}} = 0$$

Thus

$$\sum_{k=0}^{N-1}X_k e^{j(2\pi/N)sk} = Nx_s, \quad \text{and so} \quad x_s = \frac{1}{N}\sum_{k=0}^{N-1}X_k e^{j(2\pi/N)sk}$$

or more usually

$$x_n = \frac{1}{N}\sum_{k=0}^{N-1}X_k e^{j(2\pi/N)nk}$$


4. Some authors have defined the DFT in related but different ways, e.g.



(3.48) 


Clearly such differences are ones of scale only. We shall use Equations (3.46) and (3.47) 
since these are widely adopted as ‘standard’ in signal processing. 

5. $N$ is of course arbitrary above, but is often chosen to be a power of two ($N = 2^M$, $M$ an integer) owing to the advent of efficient Fourier transform algorithms called fast Fourier transforms (FFTs).

6. We have introduced the DFT as an approximation for calculating Fourier coefficients. 
However, we shall see that a formal body of theory has been constructed for ‘discrete- 
time’ systems in which the properties are exact and must be considered to stand on their 
own. Analogies with continuous-time theory are not always useful and in some cases are 
confusing. 
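The periodicity of the DFT and the exactness of the IDFT noted in the points above can be illustrated by coding Equations (3.46) and (3.47) as plain sums. This is only a sketch: the data vector `xn` is arbitrary, and the helper `dft` is a hypothetical name, not a library function.

```matlab
N  = 8;
xn = [1 3 -2 0.5 4 -1 0 2];               % hypothetical data x_n
n  = 0:N-1;
dft = @(k) sum(xn.*exp(-1j*2*pi*n*k/N));   % Equation (3.46) for any integer k
X  = zeros(1, N);
for k = 0:N-1
    X(k+1) = dft(k);
end
perErr = abs(dft(3 + 2*N) - dft(3));       % X_(k+rN) - X_k for k = 3, r = 2
xr = zeros(1, N);
for m = 0:N-1
    xr(m+1) = sum(X.*exp(1j*2*pi*m*n/N))/N;   % Equation (3.47)
end
recErr = max(abs(xr - xn));                % recovery error of the IDFT
fftErr = max(abs(X - fft(xn)));            % agreement with the built-in FFT
fprintf('periodicity error %.1e, recovery error %.1e, fft difference %.1e\n', perErr, recErr, fftErr);
```

All three errors are at the level of floating-point rounding, consistent with the statement that the discrete-time relationships are exact.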
3.7 BRIEF SUMMARY


1. A periodic signal of period $T_P$ may be expressed (Equations (3.34) and (3.35)) by

$$x(t) = \sum_{n=-\infty}^{\infty}c_n e^{j2\pi nt/T_P} \quad \text{with} \quad c_n = \frac{1}{T_P}\int_0^{T_P}x(t)e^{-j2\pi nt/T_P}\,dt$$

2. The plots of $|c_n|$ versus frequency and $\arg c_n$ versus frequency are amplitude and phase (line) spectra of the Fourier decomposition.

3. The average power of a periodic signal is described by Equation (3.39), i.e.

$$\frac{1}{T_P}\int_0^{T_P}x^2(t)\,dt = \sum_{n=-\infty}^{\infty}|c_n|^2 \quad \text{Parseval’s theorem (identity)}$$

A plot of $|c_n|^2$ versus frequency is called a power spectrum (or a periodogram).

4. The DFT and IDFT relationships for discrete data are defined by Equations (3.46) and (3.47),

$$X_k = \sum_{n=0}^{N-1}x_n e^{-j(2\pi/N)nk} \quad \text{and} \quad x_n = \frac{1}{N}\sum_{k=0}^{N-1}X_k e^{j(2\pi/N)nk}$$

The Fourier coefficients $c_k$ are approximated by $c_k \approx X_k/N$ if an integer number of periods is taken and only for a restricted range of $k$.


We now include some MATLAB examples illustrating the material covered. 


3.8 MATLAB EXAMPLES 


Example 3.1: Illustration of the convergence of the Fourier series and Gibbs’ phenomenon (see Section 3.1)

Consider Equation (3.17),

$$x(t) = \frac{4}{\pi}\left(\sin\omega_1t + \frac{1}{3}\sin 3\omega_1t + \frac{1}{5}\sin 5\omega_1t + \cdots\right)$$

In this MATLAB example, we compare the results of 3, 7 and 20 partial sums in Equation (3.17).

    clear all
    t = [0:0.001:1];                  % time variable (vector) t from 0 to 1 second with a step size of 0.001
    x = []; x_tmp = zeros(size(t));   % x is an empty matrix; x_tmp is a vector of zeros the same size as t
    for n = 1:2:39                    % n = 1, 3, 5, ..., 39 (n = 39 implies the 20 partial sums)
        x_tmp = x_tmp + 4/pi*(1/n*sin(2*pi*n*t));   % Equation (3.17); x_tmp holds the current partial sum
        x = [x; x_tmp];               % each row of x holds one partial sum, e.g. row 2 is the sum of two terms (up to n = 3)
    end                               % end of the 'for' loop
    plot(t, x(3,:), t, x(7,:), t, x(20,:))   % plot only the 3, 7 and 20 partial sums against time
    xlabel('\itt\rm (seconds)'); ylabel('\itx\rm(\itt\rm)')
    grid on



Results 



Comments: The square wave is better represented as the number of terms is increased, 
but its errors remain near the discontinuities. 
Example 3.2: Fourier coefficients of a square wave (Figure 3.20, i.e. $T_P = 1$ second and the amplitude is ‘1’)

This is examined for various values of $r$ in Equation (3.41) below (see Section 3.6):

$$c_n = \frac{1}{rT_P}\int_0^{rT_P}x(t)e^{-j2\pi nt/(rT_P)}\,dt$$

Case 1: r is an integer number. We chose r = 3 for this example; however, readers may choose any arbitrary positive integer number. The Fourier coefficients are calculated up to 10 Hz.

    clear all
    r = 3; cn = [];                   % r is the number of periods; cn is an empty matrix for the Fourier coefficients
    for n = 1:10*r                    % 'for' loop for the Fourier coefficients up to 10 Hz
        temp1 = 0; temp2 = 0;         % temporary variables
        for k = 1:2:2*r               % nested loop: integral in Equation (3.41) for the intervals of x(t) = 1
            tmp_odd = exp(-i*(k/r)*n*pi);
            temp1 = temp1 + tmp_odd;  % result stored in temp1
        end
        for k = 2:2:2*r-1             % nested loop: integral in Equation (3.41) for the intervals of x(t) = -1
            tmp_even = -exp(-i*(k/r)*n*pi);
            temp2 = temp2 + tmp_even; % result stored in temp2
        end
        temp = -1/2 + temp1 + temp2 - 1/2*exp(-i*2*n*pi);   % completes the integral in Equation (3.41)
        cn = [cn; i*temp/(pi*n)];     % final calculation of Equation (3.41) for each n; cn becomes a 30 x 1 complex vector
    end                               % end of the 'for' loop
    stem([0:1/r:n/r], [0; abs(cn)], 'o', 'filled')
                                      % [0:1/r:n/r] gives the frequencies from 0 Hz to 10 Hz at every 1/3 Hz;
                                      % [0; abs(cn)] is the modulus of each coefficient, with zero added for 0 Hz.
                                      % The result is the amplitude spectrum.
    xlabel('Frequency (Hz)')          % insert labels for each axis
    ylabel('Modulus (\mid\itc_n\rm\mid)')   % '\mid' is '|', '\it' is italic font, 'c_n' is c with subscript n, '\rm' is normal font

Results



Comments: Compare this graph with Figures 3.22(a) and (b) and with the next case. 


Case 2: r is not an integer number. We chose r = 7.5 for this example; however, readers may choose any arbitrary positive integer number + 0.5. The Fourier coefficients are calculated up to 10 Hz.

    clear all
    r = 7.5; r2 = ceil(r); cn = [];   % 'ceil' rounds towards infinity, so r2 has a value of 8
    for n = 1:10*r                    % same as previous case
        temp1 = 0; temp2 = 0;
        for k = 1:2:2*r2-3            % same as previous case except for the range of k; intervals of x(t) = 1
            tmp_odd = exp(-i*(k/r)*n*pi);
            temp1 = temp1 + tmp_odd;
        end
        for k = 2:2:2*r2-1            % same as previous case except for the range of k; intervals of x(t) = -1
            tmp_even = -exp(-i*(k/r)*n*pi);
            temp2 = temp2 + tmp_even;
        end
        temp = -1/2 + temp1 + temp2 + 1/2*exp(-i*(2*r/r)*n*pi);   % completes the integral in Equation (3.41)
        cn = [cn; i*temp/(pi*n)];     % same as previous case, but now cn is a 75 x 1 vector
    end                               % end of the 'for' loop
    stem([0:1/r:n/r], [0.5/r; abs(cn)], 'o', 'filled')
                                      % frequencies are from 0 to 10 Hz at every 1/7.5 Hz;
                                      % 0.5/r is added for 0 Hz (note the non-zero mean value)
    xlabel('Frequency (Hz)')          % same as previous case
    ylabel('Modulus (\mid\itc_n\rm\mid)')

Results


Comments: Compare this graph with Figures 3.22(c)-(e) and with the previous case. 
Fourier Integrals (Fourier Transform)
and Continuous-Time Linear Systems 


Introduction 

This chapter introduces the central concept for signal representation, the Fourier integral. 
All classes of signals may be accommodated from this as a starting point - periodic, almost 
periodic, transient and random - though each relies on a rather different perspective and 
interpretation. 

In addition, the concept of convolution is introduced which allows us to describe linear 
filtering and interpret the effect of data truncation (windowing). We begin with a derivation 
of the Fourier integral. 


4.1 THE FOURIER INTEGRAL 


We shall extend Fourier analysis to non-periodic phenomena. The basic change in the representation is that the discrete summation of the Fourier series becomes a continuous summation, i.e. an integral form. To demonstrate this change, we begin with the Fourier series representation of a periodic signal as in Equation (4.1), where the interval of integration is defined from $-T_P/2$ to $T_P/2$ for convenience:

$$x(t) = \sum_{n=-\infty}^{\infty}c_n e^{j2\pi nt/T_P} \quad \text{where} \quad c_n = \frac{1}{T_P}\int_{-T_P/2}^{T_P/2}x(t)e^{-j2\pi nt/T_P}\,dt \qquad (4.1)$$

As an example, visualize a periodic signal $x(t)$ as having the form below, in Equation (4.2) and Figure 4.1:

$$x(t) = \begin{cases} 0 & -T_P/2 < t < -1 \\ 1 & -1 < t < 1 \\ 0 & 1 < t < T_P/2 \end{cases} \qquad (T_P/2 > 1) \qquad (4.2)$$


Fundamentals of Signal Processing for Sound and Vibration Engineers 
K. Shin and J. K. Hammond. © 2008 John Wiley & Sons, Ltd 
Figure 4.1 An example of a periodic signal with a period $T_P$


Now let $T_P$ become large. As this happens we are left with a single ‘pulse’ near $t = 0$ and the others get further and further away. We examine what happens to the Fourier representation under these conditions.

The fundamental frequency $f_1 = 1/T_P$ becomes smaller and smaller and all other frequencies ($nf_1 = f_n$, say), being multiples of the fundamental frequency, are more densely packed on the frequency axis. Their separation is $1/T_P = \Delta f$ (say). So, as $T_P \to \infty$, $\Delta f \to 0$, i.e. the integral form of $c_n$ in Equation (4.1),

$$c_n = \frac{1}{T_P}\int_{-T_P/2}^{T_P/2}x(t)e^{-j2\pi nt/T_P}\,dt$$

becomes

$$c_n = \lim_{\substack{T_P \to \infty \\ (\Delta f \to 0)}}\Delta f\int_{-T_P/2}^{T_P/2}x(t)e^{-j2\pi f_nt}\,dt \qquad (4.3)$$

If the integral is finite the Fourier coefficients $c_n \to 0$ as $\Delta f \to 0$ (i.e. the more frequency components there are, the smaller are their amplitudes). To avoid this rather unhelpful result, it is more desirable to form the ratio $c_n/\Delta f$, and so it can be rewritten as

$$\lim_{\Delta f \to 0}\left(\frac{c_n}{\Delta f}\right) = \lim_{T_P \to \infty}\int_{-T_P/2}^{T_P/2}x(t)e^{-j2\pi f_nt}\,dt \qquad (4.4)$$

Assuming that the limits exist, we write this as

$$X(f_n) = \int_{-\infty}^{\infty}x(t)e^{-j2\pi f_nt}\,dt \qquad (4.5)$$

Since $\Delta f \to 0$, the frequencies $f_n$ in the above representation become a continuum, so we write $f$ instead of $f_n$. From Equation (4.4), $X(f_n)$ which is now expressed as $X(f)$ is an amplitude divided by bandwidth or amplitude density which is called the Fourier integral or the Fourier transform of $x(t)$, written as

$$X(f) = \int_{-\infty}^{\infty}x(t)e^{-j2\pi ft}\,dt \qquad (4.6)$$

Now consider the corresponding change in the representation of $x(t)$ as the sum of sines and cosines, i.e. $x(t) = \sum_{n=-\infty}^{\infty}c_n e^{j2\pi nt/T_P}$. Using the above results,

$$\lim_{\Delta f \to 0}\left(\frac{c_n}{\Delta f}\right) = X(f_n)$$

$x(t)$ can be rewritten as

$$x(t) = \lim_{\Delta f \to 0}\sum_{n=-\infty}^{\infty}X(f_n)\,\Delta f\cdot e^{j2\pi nt/T_P} \qquad (4.7)$$

which can be represented in a continuous form as

$$x(t) = \int_{-\infty}^{\infty}X(f)e^{j2\pi ft}\,df \qquad (4.8)$$

Equations (4.6) and (4.8) are called the Fourier integral pair.
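The limiting argument can be watched numerically for the pulse of Equation (4.2): as $T_P$ grows the coefficient $c_n$ at a fixed frequency shrinks, while the ratio $c_n/\Delta f = c_nT_P$ stays at the amplitude density. The sketch below assumes the transform of this unit pulse is $\sin(2\pi f)/(\pi f)$ (which follows from direct integration of Equation (4.6)); the periods, the frequency 0.25 Hz and the grid are arbitrary choices.

```matlab
X = @(f) sin(2*pi*f)./(pi*f);             % transform of the unit pulse on -1 < t < 1 (for f ~= 0)
for Tp = [4 16 64]
    t  = linspace(-Tp/2, Tp/2, 640001);
    x  = double(abs(t) < 1);              % Equation (4.2)
    n  = Tp/4;                            % harmonic for which f_n = n/Tp = 0.25 Hz
    cn = trapz(t, x.*exp(-1j*2*pi*n*t/Tp))/Tp;   % Equation (4.1)
    fprintf('Tp = %2d: c_n = %.5f, c_n/df = %.5f, X(0.25) = %.5f\n', ...
            Tp, real(cn), real(cn)*Tp, X(0.25));
end
```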


Comments on the Fourier Integral 

1. Interpretation and appearance of the Fourier transform: $X(f)$ is a (complex) amplitude density. From the representation $x(t) = \int_{-\infty}^{\infty}X(f)e^{j2\pi ft}\,df$, we see that $|X(f)|\,df$ represents the contribution in magnitude (to $x(t)$) of the frequency components in a narrow band near frequency $f$. Since $X(f)$ is complex, we may write

$$X(f) = X_{Re}(f) + jX_{Im}(f) = |X(f)|\,e^{j\phi(f)} \qquad (4.9)$$

where $|X(f)|$ is the magnitude (or amplitude) spectrum and $\phi(f)$ is the phase spectrum. When $x(t)$ is in volts, $|X(f)|$ is in volts/Hz.

If $x(t)$ is real valued, $X_{Re}(f)$ is an even function and $X_{Im}(f)$ is an odd function, and also $|X(f)|$ is an even function while $\phi(f)$ is an odd function. A typical display of $X(f)$ may look like that shown in Figure 4.2. An alternative way of displaying $X(f)$ is to use the ‘polar (or Nyquist) diagram’ as shown in Figure 4.3, where the positive frequency ($+f$) is drawn clockwise and the negative frequency ($-f$) is drawn anti-clockwise. Note the relationship between the ‘magnitude/phase’ pair with the ‘real/imaginary’ pair in these figures.


Figure 4.2 Typical display of $X(f)$: (a) magnitude spectrum, (b) phase spectrum

Figure 4.3 Polar (or Nyquist) diagram of $X(f)$, equivalent to Figure 4.2


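2. We have chosen to derive Equations (4.6) and (4.8) using $f$ (Hz). Often $\omega$ is used and alternatives to the above equations are

$$x(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}X(\omega)e^{j\omega t}\,d\omega \quad \text{and} \quad X(\omega) = \int_{-\infty}^{\infty}x(t)e^{-j\omega t}\,dt \qquad (4.10)$$

$$x(t) = \int_{-\infty}^{\infty}X(\omega)e^{j\omega t}\,d\omega \quad \text{and} \quad X(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty}x(t)e^{-j\omega t}\,dt \qquad (4.11)$$

$$x(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}X(\omega)e^{j\omega t}\,d\omega \quad \text{and} \quad X(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}x(t)e^{-j\omega t}\,dt \qquad (4.12)$$

So, the definition used must be noted carefully. Equations (4.10) are a common alternative which we shall use when necessary, and Equations (4.11) and (4.12) are not used in this book.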

3. The inversion of the Fourier pair is often accomplished using the delta function. In order to be able to do this we need to use the properties of the delta function. Recall Equation (3.26), i.e.

$$\int_{-\infty}^{\infty}e^{\pm j2\pi at}\,dt = \delta(a), \quad \text{or} \quad \int_{-\infty}^{\infty}e^{\pm jat}\,dt = 2\pi\delta(a)$$

We now demonstrate the inversion. We start with $x(t) = \int_{-\infty}^{\infty}X(f)e^{j2\pi ft}\,df$, multiply both sides by $e^{-j2\pi gt}$ and integrate with respect to $t$. Then, we obtain

$$\int_{-\infty}^{\infty}x(t)e^{-j2\pi gt}\,dt = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}X(f)e^{j2\pi(f-g)t}\,df\,dt = \int_{-\infty}^{\infty}X(f)\int_{-\infty}^{\infty}e^{j2\pi(f-g)t}\,dt\,df \qquad (4.13)$$

Using the property of the delta function (Equation (3.26)), the inner integral term on the right hand side of Equation (4.13) can be written as $\int_{-\infty}^{\infty}e^{j2\pi(f-g)t}\,dt = \delta(f - g)$, and so the right hand side of Equation (4.13) becomes

$$\int_{-\infty}^{\infty}X(f)\,\delta(f - g)\,df = X(g) \qquad (4.14)$$

Hence, equating this with the left hand side of Equation (4.13) gives $X(g) = \int_{-\infty}^{\infty}x(t)e^{-j2\pi gt}\,dt$, which proves the inverse.

Similarly, $x(t)$ can be obtained via inversion of $X(f)$ using the delta function. That is, we start with $X(f) = \int_{-\infty}^{\infty}x(t)e^{-j2\pi ft}\,dt$, multiply both sides by $e^{j2\pi ft_1}$ and integrate with respect to $f$:

$$\int_{-\infty}^{\infty}X(f)e^{j2\pi ft_1}\,df = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}x(t)e^{-j2\pi ft}e^{j2\pi ft_1}\,dt\,df = \int_{-\infty}^{\infty}x(t)\int_{-\infty}^{\infty}e^{j2\pi f(t_1-t)}\,df\,dt = \int_{-\infty}^{\infty}x(t)\,\delta(t_1 - t)\,dt = x(t_1) \qquad (4.15)$$

4. The sufficient conditions for the existence of a Fourier integral are usually given as

$$\int_{-\infty}^{\infty}|x(t)|\,dt < \infty \qquad (4.16)$$

but we shall transform functions failing to satisfy this condition using delta functions (see example (d) in Section 4.3).
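The symmetry noted in comment 1 (even real part and magnitude, odd imaginary part and phase for real-valued $x(t)$) amounts to $X(-f) = X^*(f)$ and can be checked numerically. The sketch assumes the one-sided decaying exponential treated in example (c) of Section 4.3 with a hypothetical decay rate `al = 2`; the infinite integral in Equation (4.6) is truncated at $t = 20$ and evaluated with `trapz`.

```matlab
al = 2;                                    % hypothetical decay rate
t  = linspace(0, 20, 200001);             % exp(-al*t) is negligible beyond t = 20
x  = exp(-al*t);                           % real-valued signal, zero for t < 0
Xf = @(f) trapz(t, x.*exp(-1j*2*pi*f*t)); % Equation (4.6), truncated numerically
Xp = Xf(0.5);  Xm = Xf(-0.5);
fprintf('X(+0.5) = %.5f%+.5fj\n', real(Xp), imag(Xp));
fprintf('X(-0.5) = %.5f%+.5fj\n', real(Xm), imag(Xm));
fprintf('closed form 1/(al + j*2*pi*f) at f = 0.5: %.5f%+.5fj\n', ...
        real(1/(al + 1j*pi)), imag(1/(al + 1j*pi)));
```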


4.2 ENERGY SPECTRA 

Using an electrical analogy, if $x(t)$ is the voltage across a unit resistor then the total energy dissipated in the resistor is $\int_{-\infty}^{\infty}x^2(t)\,dt$. This may be decomposed into a frequency distribution from the relationship given in Equation (4.17), which is a form of Parseval’s theorem:

$$\int_{-\infty}^{\infty}x^2(t)\,dt = \int_{-\infty}^{\infty}|X(f)|^2\,df \qquad (4.17)$$

This can be proved using the delta function, as given below:

$$\int_{-\infty}^{\infty}x^2(t)\,dt = \int_{-\infty}^{\infty}x(t)x^*(t)\,dt = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}X(f_1)e^{j2\pi f_1t}X^*(f_2)e^{-j2\pi f_2t}\,dt\,df_1\,df_2$$

$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}X(f_1)X^*(f_2)\,\delta(f_1 - f_2)\,df_1\,df_2 = \int_{-\infty}^{\infty}|X(f_1)|^2\,df_1 \qquad (4.18)$$

Note that we are using energy here, whereas for Fourier series we talked of power (power spectra). The quantity $|X(f)|^2$ is an energy spectral density (energy per unit bandwidth) since it must be multiplied by a bandwidth $df$ to give energy. It is a measure of the decomposition of the energy of the process over frequency.
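Equation (4.17) can be checked numerically for a signal with a known transform. The sketch borrows the pair $x(t) = e^{-\lambda|t|}$, $X(f) = 2\lambda/(\lambda^2 + 4\pi^2f^2)$ from example (b) of Section 4.3; the value `lam = 1.5` and the truncation limits of the two integrals are assumptions. Both energies should be close to $1/\lambda$.

```matlab
lam = 1.5;                                 % hypothetical decay parameter
t   = linspace(-20, 20, 400001);
x   = exp(-lam*abs(t));
Et  = trapz(t, x.^2);                      % time-domain energy
f   = linspace(-200, 200, 400001);
Xf  = 2*lam./(lam^2 + 4*pi^2*f.^2);        % transform of x(t), Equation (4.21)
Ef  = trapz(f, abs(Xf).^2);                % frequency-domain energy
fprintf('energy in time = %.5f, in frequency = %.5f, 1/lam = %.5f\n', Et, Ef, 1/lam);
```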


4.3 SOME EXAMPLES OF FOURIER TRANSFORMS 

Some examples are given below, which help to understand the properties of the Fourier 
transform: 

(a) The Fourier transform of the Dirac delta function $\delta(t)$ is

$$F\{\delta(t)\} = \int_{-\infty}^{\infty}\delta(t)e^{-j2\pi ft}\,dt = e^{-j2\pi f\cdot 0} = 1 \qquad (4.19)$$

where $F\{\}$ denotes the Fourier transform (shown in Figure 4.4). Note that the sifting property of the delta function is used (see Equation (3.23)).

Figure 4.4 Dirac delta function and its Fourier transform

(b) For an exponentially decaying symmetric function

$$x(t) = e^{-\lambda|t|}, \quad \lambda > 0 \qquad (4.20)$$

$$X(f) = \int_{-\infty}^{\infty}e^{-\lambda|t|}e^{-j2\pi ft}\,dt = \int_{-\infty}^{0}e^{\lambda t}e^{-j2\pi ft}\,dt + \int_{0}^{\infty}e^{-\lambda t}e^{-j2\pi ft}\,dt = \frac{2\lambda}{\lambda^2 + 4\pi^2f^2} \qquad (4.21)$$

The time history and the transform are shown in Figure 4.5.

Figure 4.5 Time domain and frequency domain graphs of example (b)
Observations:

(i) The time history is symmetric with respect to $t = 0$ (i.e. an even function), so the transform is entirely real (i.e. a cosine transform) and the phase is zero, i.e. a so-called zero-phase signal.

(ii) The parameter $\lambda$ controls the shape of the signal and its transform. In the frequency domain the transform falls to 1/2 of its value at $f = 0$ at a frequency $f = \lambda/2\pi$. So if $\lambda$ is large then $x(t)$ is narrow in the time domain, but wide in the frequency domain and vice versa. This is an example of the so-called inverse spreading property of the Fourier transform, i.e. the wider in one domain, then the narrower in the other.

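(c) For an exponentially decaying function

$$x(t) = e^{-at}, \quad t \ge 0,\ a > 0 \tag{4.22}$$

$$
X(f) = \int_{-\infty}^{\infty} x(t)e^{-j2\pi f t}\,dt = \int_{0}^{\infty} e^{-at}e^{-j2\pi f t}\,dt
= \int_{0}^{\infty} e^{-(a + j2\pi f)t}\,dt = \frac{1}{a + j2\pi f} = |X(f)|e^{j\phi(f)} \tag{4.23}
$$

where

$$|X(f)| = \frac{1}{\sqrt{a^2 + 4\pi^2 f^2}} \quad \text{and} \quad \phi(f) = \tan^{-1}\left(-\frac{2\pi f}{a}\right)$$

The time and frequency domains are shown in Figure 4.6.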



Figure 4.6 Time domain and frequency domain graphs of example (c): (a) time domain, (b) 
magnitude spectrum, (c) phase spectrum 



(d) For a sine function

$$x(t) = A\sin(2\pi p t) \tag{4.24}$$

$$
\begin{aligned}
X(f) &= \int_{-\infty}^{\infty} x(t)e^{-j2\pi f t}\,dt = \int_{-\infty}^{\infty} A\sin(2\pi p t)\,e^{-j2\pi f t}\,dt
= \int_{-\infty}^{\infty} \frac{A}{2j}\left(e^{j2\pi p t} - e^{-j2\pi p t}\right)e^{-j2\pi f t}\,dt \\
&= \frac{A}{2j}\int_{-\infty}^{\infty}\left[e^{-j2\pi(f - p)t} - e^{-j2\pi(f + p)t}\right]dt
= \frac{A}{2j}\left[\delta(f - p) - \delta(f + p)\right]
\end{aligned} \tag{4.25}
$$

Figure 4.7 Time domain and frequency domain graphs of example (d)

In this example, $X(f)$ is non-zero at $f = p$ and $f = -p$ only, and is imaginary valued
(the sine function is an odd function). The phase components are $\arg X(p) = -\pi/2$ and
$\arg X(-p) = \pi/2$. This shows that a distinct frequency component results in spikes in
the amplitude density.


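(e) For a rectangular pulse

$$
x(t) = \begin{cases} a, & |t| < b \\ 0, & |t| > b \end{cases} \tag{4.26}
$$

$$
X(f) = \int_{-\infty}^{\infty} x(t)e^{-j2\pi f t}\,dt = \int_{-b}^{b} a\,e^{-j2\pi f t}\,dt
= 2ab\,\frac{\sin(2\pi f b)}{2\pi f b} \tag{4.27}
$$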





Figure 4.8 Time domain and frequency domain graphs of example (e) 

The expression for $X(f)$ has been written so as to highlight the term $\sin(2\pi f b)/2\pi f b$,
i.e. $\sin(x)/x$, which is the so-called sinc function. It is unity at $x = 0$, and thereafter
is an amplitude-modulated oscillation, where the modulation is $1/x$. The width (in time)
of $x(t)$ is $2b$ and the distance to the first zero crossing in the frequency domain is $1/2b$
(as shown in Figure 4.8). This once again demonstrates the inverse spreading property.

For the case $a = 1$, then as $b \to \infty$, $X(f)$ is more and more concentrated around
$f = 0$ and becomes taller and taller. In fact, $\lim_{b\to\infty} 2b\sin(2\pi f b)/2\pi f b$ is another way
of expressing the delta function $\delta(f)$, as we have seen in Chapter 3 (see Equation (3.27)).
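
As a quick check of Equation (4.27), the transform of the rectangular pulse can be evaluated by direct numerical integration and compared with the closed form. This is only a sketch: the function names, the default values of `a` and `b` and the number of integration points are illustrative assumptions. Note that NumPy defines `np.sinc(x)` as $\sin(\pi x)/(\pi x)$, so $\sin(2\pi f b)/(2\pi f b)$ corresponds to `np.sinc(2 * f * b)`.

```python
import numpy as np

def rect_transform_numeric(f, a=1.0, b=0.5, n=20001):
    """Trapezoidal estimate of the integral of a*exp(-j 2 pi f t) over [-b, b]."""
    t = np.linspace(-b, b, n)
    dt = t[1] - t[0]
    g = a * np.exp(-1j * 2.0 * np.pi * f * t)
    return (np.sum(g) - 0.5 * (g[0] + g[-1])) * dt

def rect_transform_closed(f, a=1.0, b=0.5):
    """Closed form 2ab sin(2 pi f b)/(2 pi f b), written with NumPy's sinc."""
    return 2.0 * a * b * np.sinc(2.0 * f * b)

for f in (0.0, 0.3, 1.0, 2.7):
    print(f, rect_transform_numeric(f).real, rect_transform_closed(f))
```

With $b = 0.5$ the first zero crossing is at $f = 1/2b = 1$, where both estimates are zero to within rounding error.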
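(f) For a damped symmetrically oscillating function

$$x(t) = e^{-a|t|}\cos 2\pi f_0 t, \quad a > 0 \tag{4.28}$$

$$
\begin{aligned}
X(f) &= \int_{-\infty}^{\infty} x(t)e^{-j2\pi f t}\,dt = \int_{-\infty}^{\infty} e^{-a|t|}\cos(2\pi f_0 t)\,e^{-j2\pi f t}\,dt
= \int_{-\infty}^{\infty} e^{-a|t|}\,\frac{1}{2}\left(e^{j2\pi f_0 t} + e^{-j2\pi f_0 t}\right)e^{-j2\pi f t}\,dt \\
&= \frac{1}{2}\int_{-\infty}^{\infty} e^{-a|t|}e^{-j2\pi(f - f_0)t}\,dt + \frac{1}{2}\int_{-\infty}^{\infty} e^{-a|t|}e^{-j2\pi(f + f_0)t}\,dt \\
&= \frac{a}{a^2 + [2\pi(f - f_0)]^2} + \frac{a}{a^2 + [2\pi(f + f_0)]^2}
\end{aligned} \tag{4.29}
$$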

The time and frequency domains are shown in Figure 4.9. 





Figure 4.9 Time domain and frequency domain graphs of example (f) 

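(g) For a damped oscillating function

$$x(t) = e^{-at}\sin 2\pi f_0 t, \quad t \ge 0 \text{ and } a > 0 \tag{4.30}$$

$$
\begin{aligned}
X(f) &= \int_{-\infty}^{\infty} x(t)e^{-j2\pi f t}\,dt = \int_{0}^{\infty} e^{-at}\sin(2\pi f_0 t)\,e^{-j2\pi f t}\,dt
= \int_{0}^{\infty} e^{-at}\,\frac{1}{2j}\left(e^{j2\pi f_0 t} - e^{-j2\pi f_0 t}\right)e^{-j2\pi f t}\,dt \\
&= \frac{1}{2j}\int_{0}^{\infty} e^{-[a + j2\pi(f - f_0)]t}\,dt - \frac{1}{2j}\int_{0}^{\infty} e^{-[a + j2\pi(f + f_0)]t}\,dt
= \frac{2\pi f_0}{(2\pi f_0)^2 + (a + j2\pi f)^2}
\end{aligned} \tag{4.31}
$$

The time and frequency domains are shown in Figure 4.10.
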


Figure 4.10 Time domain and frequency domain graphs of example (g): (a) time domain, (b) 
magnitude spectrum, (c) phase spectrum 
(h) For the Gaussian pulse

$$x(t) = e^{-at^2}, \quad a > 0 \tag{4.32}$$

$$X(\omega) = \sqrt{\pi/a}\; e^{-\omega^2/4a} \tag{4.33}$$

i.e. $X(\omega)$ is also a Gaussian pulse. Proof of Equation (4.33) is given below. We shall use
$X(\omega)$ instead of $X(f)$ for convenience.

We start from

$$X(\omega) = \int_{-\infty}^{\infty} e^{-at^2}e^{-j\omega t}\,dt = \int_{-\infty}^{\infty} e^{-a(t^2 + j\omega t/a)}\,dt$$

and multiply by $e^{-\omega^2/4a}\cdot e^{\omega^2/4a}$ to complete the square, i.e. so that

$$
X(\omega) = e^{-\omega^2/4a}\int_{-\infty}^{\infty} e^{-a(t^2 + j\omega t/a - \omega^2/4a^2)}\,dt
= e^{-\omega^2/4a}\int_{-\infty}^{\infty} e^{-a[t + j(\omega/2a)]^2}\,dt
$$

Now, let $y = t + j(\omega/2a)$; then finally we have

$$X(\omega) = e^{-\omega^2/4a}\int_{-\infty}^{\infty} e^{-ay^2}\,dy = \sqrt{\pi/a}\; e^{-\omega^2/4a}$$

The time and frequency domains are shown in Figure 4.11.





Figure 4.11 Time domain and frequency domain graphs of example (h) 

(i) For a unit step function

$$
u(t) = \begin{cases} 1, & t > 0 \\ 0, & t < 0 \end{cases} \tag{4.34}
$$

The unit step function is not defined at $t = 0$, i.e. it has a discontinuity at this point. The
Fourier transform of $u(t)$ is given by

$$F\{u(t)\} = \frac{1}{2}\delta(f) + \frac{1}{j2\pi f} \tag{4.35}$$

where $F\{\,\}$ denotes the Fourier transform (shown in Figure 4.12). The derivation of this
result requires the use of the delta function and some properties of the Fourier transform.
Details of the derivation can be found in Hsu (1970), if required. Also, note the presence
of the delta function at $f = 0$, which indicates a d.c. component.

Figure 4.12 Time domain and frequency domain graphs of example (i): (a) time domain, (b)
magnitude spectrum, (c) phase spectrum


(j) For the Fourier transform of a periodic function

If $x(t)$ is periodic with a period $T_P$, then

$$x(t) = \sum_{n=-\infty}^{\infty} c_n e^{j2\pi n t/T_P}$$

(i.e. Equation (3.34)). The Fourier transform of this equation gives

$$
X(f) = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} c_n e^{j2\pi n t/T_P}e^{-j2\pi f t}\,dt
= \sum_{n=-\infty}^{\infty} c_n \int_{-\infty}^{\infty} e^{-j2\pi(f - n/T_P)t}\,dt
= \sum_{n=-\infty}^{\infty} c_n\,\delta(f - n/T_P) \tag{4.36}
$$

This shows that the Fourier transform of a periodic function is a series of delta functions
scaled by $c_n$, and located at multiples of the fundamental frequency, $1/T_P$.


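Note that, in examples (a), (b), (e), (f) and (h), $X(f)$ is real valued owing to the evenness of
$x(t)$, so $\arg X(f) = 0$ (or $\pm\pi$ where $X(f)$ is negative, as in the side lobes of example (e)).
Some useful Fourier transform pairs are given in Table 4.1.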


4.4 PROPERTIES OF FOURIER TRANSFORMS 

We now list some important properties of Fourier transforms. Here F{x(t)} denotes X(f). 

(a) Time scaling:

$$F\{x(at)\} = \frac{1}{|a|}X(f/a) \tag{4.37a}$$

or

$$F\{x(at)\} = \frac{1}{|a|}X(\omega/a) \tag{4.37b}$$

where $a$ is a real constant. The proof is given below.
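
Table 4.1 Some Fourier transform (integral) pairs

| No. | Time function $x(t)$ | Fourier transform $X(f)$ | Fourier transform $X(\omega)$ |
|---|---|---|---|
| 1 | $\delta(t)$ | $1$ | $1$ |
| 2 | $1$ | $\delta(f)$ | $2\pi\delta(\omega)$ |
| 3 | $A$ | $A\delta(f)$ | $2\pi A\delta(\omega)$ |
| 4 | $u(t)$ | $\frac{1}{2}\delta(f) + \frac{1}{j2\pi f}$ | $\pi\delta(\omega) + \frac{1}{j\omega}$ |
| 5 | $\delta(t - t_0)$ | $e^{-j2\pi f t_0}$ | $e^{-j\omega t_0}$ |
| 6 | $e^{j2\pi f_0 t}$ or $e^{j\omega_0 t}$ | $\delta(f - f_0)$ | $2\pi\delta(\omega - \omega_0)$ |
| 7 | $\cos(2\pi f_0 t)$ or $\cos(\omega_0 t)$ | $\frac{1}{2}[\delta(f - f_0) + \delta(f + f_0)]$ | $\pi[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]$ |
| 8 | $\sin(2\pi f_0 t)$ or $\sin(\omega_0 t)$ | $\frac{1}{2j}[\delta(f - f_0) - \delta(f + f_0)]$ | $-j\pi[\delta(\omega - \omega_0) - \delta(\omega + \omega_0)]$ |
| 9 | $e^{-a\lvert t\rvert}$ | $\frac{2a}{a^2 + 4\pi^2 f^2}$ | $\frac{2a}{a^2 + \omega^2}$ |
| 10 | $\frac{1}{a^2 + t^2}$ | $\frac{\pi}{a}e^{-a\lvert 2\pi f\rvert}$ | $\frac{\pi}{a}e^{-a\lvert\omega\rvert}$ |
| 11 | $x(t) = e^{-at}u(t)$ | $\frac{1}{a + j2\pi f}$ | $\frac{1}{a + j\omega}$ |
| 12 | $x(t) = A$ for $\lvert t\rvert < T$; $= 0$ for $\lvert t\rvert > T$ | $2AT\,\frac{\sin(2\pi f T)}{2\pi f T}$ | $2AT\,\frac{\sin(\omega T)}{\omega T}$ |
| 13 | $2Af_0\,\frac{\sin(2\pi f_0 t)}{2\pi f_0 t}$ or $A\,\frac{\sin(\omega_0 t)}{\pi t}$ | $X(f) = A$ for $\lvert f\rvert < f_0$; $= 0$ for $\lvert f\rvert > f_0$ | $X(\omega) = A$ for $\lvert\omega\rvert < \omega_0$; $= 0$ for $\lvert\omega\rvert > \omega_0$ |
| 14 | $\sum_{n=-\infty}^{\infty} c_n e^{j2\pi n f_0 t}$ or $\sum_{n=-\infty}^{\infty} c_n e^{jn\omega_0 t}$ | $\sum_{n=-\infty}^{\infty} c_n\delta(f - nf_0)$ | $2\pi\sum_{n=-\infty}^{\infty} c_n\delta(\omega - n\omega_0)$ |
| 15 | $\mathrm{sgn}(t)$ | $\frac{1}{j\pi f}$ | $\frac{2}{j\omega}$ |
| 16 | $\frac{1}{t}$ | $-j\pi\,\mathrm{sgn}(f)$ | $-j\pi\,\mathrm{sgn}(\omega)$ |
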


For $a > 0$, the Fourier transform is $F\{x(at)\} = \int_{-\infty}^{\infty} x(at)e^{-j2\pi f t}\,dt$. Let $at = \tau$. Then

$$F\{x(at)\} = \frac{1}{a}\int_{-\infty}^{\infty} x(\tau)e^{-j2\pi(f/a)\tau}\,d\tau = \frac{1}{a}X(f/a)$$

Similarly for $a < 0$, the limits of integration are interchanged by the substitution:

$$F\{x(at)\} = \frac{1}{a}\int_{\infty}^{-\infty} x(\tau)e^{-j2\pi(f/a)\tau}\,d\tau$$

thus

$$F\{x(at)\} = -\frac{1}{a}\int_{-\infty}^{\infty} x(\tau)e^{-j2\pi(f/a)\tau}\,d\tau = \frac{1}{|a|}X(f/a)$$

That is, time scaling results in frequency scaling, again demonstrating the inverse spreading
relationship.

(b) Time reversal:

$$F\{x(-t)\} = X(-f) \quad (= X^*(f), \text{ for } x(t) \text{ real}) \tag{4.38a}$$

$$F\{x(-t)\} = X(-\omega) \tag{4.38b}$$

Proof: We start from $F\{x(-t)\} = \int_{-\infty}^{\infty} x(-t)e^{-j2\pi f t}\,dt$, let $-t = \tau$, then obtain

$$
F\{x(-t)\} = -\int_{\infty}^{-\infty} x(\tau)e^{j2\pi f\tau}\,d\tau = \int_{-\infty}^{\infty} x(\tau)e^{-j2\pi(-f)\tau}\,d\tau = X(-f)
$$

Note that if $x(t)$ is real, then $x^*(t) = x(t)$. In this case,

$$
X(-f) = \int_{-\infty}^{\infty} x(t)e^{-j2\pi(-f)t}\,dt = \int_{-\infty}^{\infty} x^*(t)e^{j2\pi f t}\,dt = X^*(f)
$$

This is called the conjugate symmetry property.

It is interesting to note that the Fourier transform of $X(-\omega)$ is $x(t)$, i.e. $F\{X(-\omega)\} = x(t)$,
and similarly $F\{X(\omega)\} = x(-t)$.

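(c) Time shifting:

$$F\{x(t - t_0)\} = e^{-j2\pi f t_0}X(f) \tag{4.39a}$$

$$F\{x(t - t_0)\} = e^{-j\omega t_0}X(\omega) \tag{4.39b}$$

Proof: We start from $F\{x(t - t_0)\} = \int_{-\infty}^{\infty} x(t - t_0)e^{-j2\pi f t}\,dt$, let $t - t_0 = \tau$, then obtain

$$
F\{x(t - t_0)\} = \int_{-\infty}^{\infty} x(\tau)e^{-j2\pi f(t_0 + \tau)}\,d\tau
= e^{-j2\pi f t_0}\int_{-\infty}^{\infty} x(\tau)e^{-j2\pi f\tau}\,d\tau = e^{-j2\pi f t_0}X(f)
$$

This important property is expanded upon in Section 4.5.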
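
The time shifting property has a direct discrete counterpart that is easy to verify. The sketch below is an illustration only, assuming a sampled Gaussian-shaped pulse and a circular (wrap-around) delay of `k` samples; the signal, `N` and `k` are arbitrary choices, and the discrete Fourier transform is used as a stand-in for the Fourier integral. The delay leaves the magnitude unchanged and multiplies the spectrum by a linear-phase term.

```python
import numpy as np

N = 256                                   # number of samples (illustrative)
n = np.arange(N)
x = np.exp(-0.5 * ((n - 60) / 8.0) ** 2)  # smooth pulse centred at sample 60

k = 25                                    # delay in samples
x_delayed = np.roll(x, k)                 # circular delay by k samples

X = np.fft.fft(x)
X_delayed = np.fft.fft(x_delayed)

# Discrete form of Equation (4.39a): multiply X by exp(-j 2 pi f k),
# where f is the frequency in cycles per sample.
f = np.fft.fftfreq(N)
X_predicted = X * np.exp(-1j * 2.0 * np.pi * f * k)

print(np.allclose(X_delayed, X_predicted))         # linear-phase factor accounts for the delay
print(np.allclose(np.abs(X_delayed), np.abs(X)))   # magnitude spectrum is unchanged
```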
(d) Modulation (or multiplication) property:

(i)

$$F\{x(t)e^{j2\pi f_0 t}\} = X(f - f_0) \tag{4.40a}$$

$$F\{x(t)e^{j\omega_0 t}\} = X(\omega - \omega_0) \tag{4.40b}$$

This property is usually known as the 'frequency shifting' property.

Proof:

$$
F\{x(t)e^{j2\pi f_0 t}\} = \int_{-\infty}^{\infty} x(t)e^{j2\pi f_0 t}e^{-j2\pi f t}\,dt
= \int_{-\infty}^{\infty} x(t)e^{-j2\pi(f - f_0)t}\,dt = X(f - f_0)
$$

(ii)

$$F\{x(t)\cos(2\pi f_0 t)\} = \frac{1}{2}\left[X(f - f_0) + X(f + f_0)\right] \tag{4.41a}$$

$$F\{x(t)\cos(\omega_0 t)\} = \frac{1}{2}\left[X(\omega - \omega_0) + X(\omega + \omega_0)\right] \tag{4.41b}$$

This characterizes 'amplitude modulation'. For communication systems, usually $x(t)$
is a low-frequency signal, and $\cos(2\pi f_0 t)$ is a high-frequency carrier signal.

Proof:

$$
\begin{aligned}
F\{x(t)\cos(2\pi f_0 t)\} &= F\left\{\tfrac{1}{2}x(t)e^{j2\pi f_0 t} + \tfrac{1}{2}x(t)e^{-j2\pi f_0 t}\right\} \\
&= \tfrac{1}{2}F\{x(t)e^{j2\pi f_0 t}\} + \tfrac{1}{2}F\{x(t)e^{-j2\pi f_0 t}\} \\
&= \tfrac{1}{2}\left[X(f - f_0) + X(f + f_0)\right]
\end{aligned}
$$

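(e) Differentiation:

$$F\{\dot{x}(t)\} = j2\pi f X(f) \quad (\text{if } x(t) \to 0 \text{ as } t \to \pm\infty) \tag{4.42a}$$

$$F\{\dot{x}(t)\} = j\omega X(\omega) \tag{4.42b}$$

Proof: Integrating by parts,

$$
F\{\dot{x}(t)\} = \int_{-\infty}^{\infty} \dot{x}(t)e^{-j2\pi f t}\,dt
= \left. x(t)e^{-j2\pi f t}\right|_{-\infty}^{\infty} + j2\pi f\int_{-\infty}^{\infty} x(t)e^{-j2\pi f t}\,dt
$$

Since $x(t) \to 0$ as $t \to \pm\infty$, the first part of the right hand side vanishes. Thus

$$F\{\dot{x}(t)\} = j2\pi f\int_{-\infty}^{\infty} x(t)e^{-j2\pi f t}\,dt = j2\pi f X(f)$$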

(f) The Fourier transform of the 'convolution' of two functions:

$$F\{h(t) * x(t)\} = H(f)X(f) \tag{4.43}$$

where the convolution of the two functions $h(t)$ and $x(t)$ is defined as

$$h(t) * x(t) = \int_{-\infty}^{\infty} h(\tau)x(t - \tau)\,d\tau \tag{4.44}$$

The property of Equation (4.43) is very important in linear system theory and is explained
fully in Section 4.7.

Proof: Let $t - \tau = v$. Then

$$
\begin{aligned}
F\{h(t) * x(t)\} &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\tau)x(t - \tau)e^{-j2\pi f t}\,d\tau\,dt \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\tau)x(v)e^{-j2\pi f(\tau + v)}\,d\tau\,dv \\
&= \int_{-\infty}^{\infty} h(\tau)e^{-j2\pi f\tau}\,d\tau\int_{-\infty}^{\infty} x(v)e^{-j2\pi f v}\,dv = H(f)X(f)
\end{aligned}
$$


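(g) The Fourier transform of the 'product' of two functions:

$$F\{x(t)w(t)\} = \int_{-\infty}^{\infty} X(g)W(f - g)\,dg = X(f) * W(f) \tag{4.45}$$

This is also a very important property, and will be examined in detail in Section 4.11.

Proof: We start from $F\{x(t)w(t)\} = \int_{-\infty}^{\infty} x(t)w(t)e^{-j2\pi f t}\,dt$. If $x(t)$ and $w(t)$ both have
Fourier representations, then the right hand side is

$$
\begin{aligned}
\int_{-\infty}^{\infty} x(t)w(t)e^{-j2\pi f t}\,dt
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} X(f_1)e^{j2\pi f_1 t}\,W(f_2)e^{j2\pi f_2 t}\cdot e^{-j2\pi f t}\,df_1\,df_2\,dt \\
&= \int_{-\infty}^{\infty} X(f_1)\int_{-\infty}^{\infty} W(f_2)\int_{-\infty}^{\infty} e^{-j2\pi(f - f_1 - f_2)t}\,dt\,df_2\,df_1 \\
&= \int_{-\infty}^{\infty} X(f_1)\int_{-\infty}^{\infty} W(f_2)\,\delta(f - f_1 - f_2)\,df_2\,df_1 \\
&= \int_{-\infty}^{\infty} X(f_1)W(f - f_1)\,df_1 = X(f) * W(f)
\end{aligned}
$$
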


4.5 THE IMPORTANCE OF PHASE 

In many cases, we draw only the magnitude spectral density, $|X(f)|$, and
not the phase spectral density, $\arg X(f) = \phi(f)$. However, in order to reconstruct a
signal we need both. An infinite number of different-looking signals may have the same
magnitude spectra - it is their phase structure that differs. We now make a few general
comments:

1. A symmetrical signal has a real-valued transform, i.e. its phase is zero. We saw this
property in examples given in Section 4.3.

2. A pure delay imposed on a signal results in a linear phase change to the transform
(see property (c) in Section 4.4). An example of this is illustrated in Figure 4.13.

Figure 4.13 The effect of a pure delay on a zero-phase signal (phase $0$ before the delay; phase $-2\pi f t_0$ after the delay)

The slope of the phase curve gives the delay, i.e. $d\phi/df = -2\pi t_0$, or $d\phi/d\omega = -t_0$.
Specifically, the quantity $-d\phi/d\omega = t_0$ is known as the group delay of the signal.
In the above case, the delay is the same for all frequencies due to the pure delay (i.e.
there is no dispersion). The reason for the term group delay is given in Section 4.8.

3. If the phase curve is nonlinear, i.e. $-d\phi/d\omega$ is not constant but varies with $\omega$, then the
signal shape is altered.


4.6 ECHOES$^{\mathrm{M4.1}}$

If a signal $y(t)$ contains a pure echo (a scaled replica of the main signal), it may be
modelled as

$$y(t) = x(t) + a\,x(t - t_0) \tag{4.46}$$

where $x(t)$ is the main signal and $a\,x(t - t_0)$ is the echo, $a$ is the amplitude of the echo,
and $t_0$ is called the 'epoch' of the echo (i.e. the time delay of the echo relative to the main
signal). A typical example may be illustrated as shown in Figure 4.14, and the Fourier
transform of $y(t)$ is

$$Y(f) = \left(1 + a\,e^{-j2\pi f t_0}\right)X(f) \tag{4.47}$$

Figure 4.14 Example of a signal containing a pure echo

The term $(1 + a\,e^{-j2\pi f t_0})$ is a function of frequency and has an oscillatory form
in both magnitude and phase. This describes the effect of the echo on the main
signal, and may be illustrated as shown in Figure 4.15. The magnitude of $Y(f)$ is
$\sqrt{1 + a^2 + 2a\cos 2\pi f t_0}\;|X(f)|$ where an oscillatory form is imposed on $|X(f)|$ due
to the echo. Thus, such a 'rippling' appearance in energy (or power) spectra may indicate
the existence of an echo. However, additional echoes and dispersion result in more complicated
features. The autocorrelation function can also be used to detect the time delays
of echoes in a signal (the correlation function will be discussed in Part II of this book),
but this is usually limited to wideband signals (e.g. a pulse-like signal). Another approach
to analysing such signals is 'cepstral analysis' (Bogert et al., 1963), later generalized as
homomorphic deconvolution (Oppenheim and Schafer, 1975).

[Figure 4.15: plot of $|Y(f)|$ showing the ripple imposed by the echo]
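
The ripple imposed by an echo can be illustrated with a few lines of code. The following is a minimal sketch, assuming an echo amplitude `a = 0.5` and an epoch `t0 = 0.02` s (both values are illustrative and not taken from the text). It evaluates the factor $(1 + a\,e^{-j2\pi f t_0})$ of Equation (4.47) and compares its magnitude with the closed-form expression given above.

```python
import numpy as np

a = 0.5      # echo amplitude (illustrative)
t0 = 0.02    # echo epoch in seconds (illustrative)

f = np.linspace(0.0, 200.0, 2001)                 # frequency axis in Hz
echo_factor = 1.0 + a * np.exp(-1j * 2.0 * np.pi * f * t0)

mag = np.abs(echo_factor)
mag_formula = np.sqrt(1.0 + a**2 + 2.0 * a * np.cos(2.0 * np.pi * f * t0))

# The ripple repeats every 1/t0 Hz and swings between 1 - a and 1 + a.
ripple_period_hz = 1.0 / t0
print(ripple_period_hz, mag.min(), mag.max())
```

Because the ripple period in frequency is $1/t_0$, measuring the spacing between successive peaks of a rippled spectrum gives an estimate of the echo delay.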



4.7 CONTINUOUS-TIME LINEAR TIME-INVARIANT SYSTEMS 
AND CONVOLUTION 

Consider the input-output relationship for a linear time-invariant (LTI) system as shown 
in Figure 4.16. 


Input $x(t)$ → [System] → Output $y(t)$

Figure 4.16 A continuous LTI system


We now define the terms ‘linear’ and ‘time-invariant’. 
Linearity

Let $y_1(t)$ and $y_2(t)$ be the responses of the system to inputs $x_1(t)$ and $x_2(t)$, respectively.
If the system is linear it satisfies the properties in Figure 4.17, where $a$ is an arbitrary
constant.

(i) Additivity: $x_1(t) + x_2(t)$ → [Linear system] → $y_1(t) + y_2(t)$

(ii) Scaling (or homogeneity): $a\,x_1(t)$ → [Linear system] → $a\,y_1(t)$

Figure 4.17 Properties of a linear system

Or the two properties can be combined to give a more general expression that
is known as the 'superposition property' (Figure 4.18), where $a_1$ and $a_2$ are arbitrary
constants.

$a_1 x_1(t) + a_2 x_2(t)$ → [Linear system] → $a_1 y_1(t) + a_2 y_2(t)$

Figure 4.18 Superposition property of a linear system


Time Invariance 

A time-invariant system may be illustrated as in Figure 4.19, such that if the input is
shifted by $t_0$, then the response will also be shifted by the same amount of time.

$x(t - t_0)$ → [Time-invariant system] → $y(t - t_0)$

Figure 4.19 Property of a time-invariant system


Mathematical Characterization of an LTI System 

Very commonly LTI systems are described in differential equation form. The forced vibration
of a single-degree-of-freedom system is a typical example, which may be expressed
as

$$m\ddot{y}(t) + c\dot{y}(t) + ky(t) = x(t) \tag{4.48}$$

where $x(t)$ is the input and $y(t)$ is the output of the system.

Relating $y(t)$ to $x(t)$ in the time domain then requires the solution of the differential
equation. Transformation (Laplace and Fourier) techniques allow a 'systems approach'
with the input/response relationships described by transfer functions or frequency response
functions.

We shall use a general approach to linear system characterization that does not
require a differential equation format. We could characterize a system in terms of its
response to specific inputs, e.g. a step input or a harmonic input, but we shall find that
the response to an ideal impulse (the Dirac delta function) turns out to be very helpful -
even though such an input is a mathematical idealization.
We define the response of a linear system to a unit impulse at $t = 0$ (i.e. $\delta(t)$) to be
$h(t)$. See Figure 4.20. In the figure, it can be seen that the system only responds after the
impulse, i.e. we assume that the system is causal, in other words $h(t) = 0$ for $t < 0$. For
a causal system, the output $y(t)$ at the present time, say $t = t_1$, is dependent upon only
the past and present values of the input $x(t)$, i.e. $x(t)$ for $t \le t_1$, and does not depend on
the future values of $x(t)$.

$x(t) = \delta(t)$ → [LTI system, $h(t)$] → $y(t) = h(t)$

Figure 4.20 Impulse response of a system

We shall now show how the concept of the ideal impulse response function $h(t)$
can be used to describe the system response to any input. We start by noting that for a
time-invariant system, the response to a delayed impulse $\delta(t - t_1)$ is a delayed impulse
response $h(t - t_1)$.

Consider an arbitrary input signal $x(t)$ split up into elemental impulses as given
in Figure 4.21. The impulse at time $t_1$ is $x(t_1)\Delta t_1$. Because the system is linear, the
response to this impulse at time $t$ is $h(t - t_1)x(t_1)\Delta t_1$. Now, adding all the responses to
such impulses, the total response of $y(t)$ at time $t$ (the present) becomes

$$y(t) \approx \sum h(t - t_1)x(t_1)\Delta t_1 \tag{4.49}$$

and by letting $\Delta t_1 \to 0$ this results in

$$y(t) = \int_{-\infty}^{t} h(t - t_1)x(t_1)\,dt_1 \tag{4.50}$$

Note that the upper limit is $t$ because we assume that the system is causal. Using the
substitution $t - t_1 = \tau$ ($-dt_1 = d\tau$), the expression can be written in an alternative
form as given by Equation (4.51a), i.e. the convolution integral has the commutative
property

$$
y(t) = \int_{-\infty}^{t} h(t - t_1)x(t_1)\,dt_1 = -\int_{\infty}^{0} h(\tau)x(t - \tau)\,d\tau = \int_{0}^{\infty} h(\tau)x(t - \tau)\,d\tau \tag{4.51a}
$$

or simply

$$y(t) = x(t) * h(t) = h(t) * x(t) \tag{4.51b}$$

As depicted in Figure 4.22, we see $h(\tau)$ in its role as a 'memory' or weighting function.

Figure 4.22 The impulse response function as a 'memory'

If the input $x(t)$ is zero for $t < 0$, the response of a causal system is

$$y(t) = \int_{0}^{t} h(\tau)x(t - \tau)\,d\tau = \int_{0}^{t} h(t - \tau)x(\tau)\,d\tau \tag{4.52}$$

And, if the system is non-causal, i.e. the system also responds to future inputs, the
convolution integrals are

$$y(t) = \int_{-\infty}^{\infty} h(\tau)x(t - \tau)\,d\tau = \int_{-\infty}^{\infty} h(t - \tau)x(\tau)\,d\tau \tag{4.53}$$

An example of the convolution operation for a causal input and a causal LTI system is
illustrated in Figure 4.23.

We note that, obviously,

$$h(t) = h(t) * \delta(t) = \int_{-\infty}^{\infty} h(\tau)\delta(t - \tau)\,d\tau \tag{4.54}$$

The convolution integral also satisfies 'associative' and 'distributive' properties, i.e.

$$\text{Associative:}\quad [x(t) * h_1(t)] * h_2(t) = x(t) * [h_1(t) * h_2(t)] \tag{4.55}$$

$$\text{Distributive:}\quad x(t) * [h_1(t) + h_2(t)] = x(t) * h_1(t) + x(t) * h_2(t) \tag{4.56}$$

Figure 4.23 Illustrations of a convolution operation (panels: $x(t)$; $h(t)$; $h(t - \tau)$ and $x(\tau)$; the integral of $h(t - \tau)x(\tau)$, i.e. the value of the convolution at $t$; $y(t) = x(t) * h(t)$)
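
The convolution integral of Equation (4.52) can be approximated on a sampled time axis by a discrete convolution scaled by the sample interval. The sketch below is an illustration under assumed choices: the impulse response $h(t) = e^{-t}$ for $t \ge 0$, a unit step input and the step size `dt` are all hypothetical. For this pair the exact response is $y(t) = 1 - e^{-t}$.

```python
import numpy as np

dt = 1e-3                                # sample interval (illustrative)
t = np.arange(0.0, 5.0 + dt / 2, dt)     # time axis from 0 to 5 s

h = np.exp(-t)                           # causal impulse response h(t) = exp(-t)
x = np.ones_like(t)                      # causal unit step input

# Riemann-sum approximation of the integral of h(tau) x(t - tau) from 0 to t.
y_numeric = np.convolve(h, x)[: len(t)] * dt
y_exact = 1.0 - np.exp(-t)

# Commutative property, Equation (4.51b): swapping the arguments changes nothing.
y_swapped = np.convolve(x, h)[: len(t)] * dt

print(np.max(np.abs(y_numeric - y_exact)))   # error is of the order of dt
print(np.allclose(y_numeric, y_swapped))
```

Only the first `len(t)` samples of the full discrete convolution are kept, because later samples would need input values beyond the end of the simulated record.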


The Frequency Response Function 


Consider the steady state response of a system to a harmonic excitation, i.e. let $x(t) = e^{j2\pi f t}$.
Then the convolution integral becomes

$$
y(t) = \int_{0}^{\infty} h(\tau)x(t - \tau)\,d\tau = \int_{0}^{\infty} h(\tau)e^{j2\pi f(t - \tau)}\,d\tau
= e^{j2\pi f t}\underbrace{\int_{0}^{\infty} h(\tau)e^{-j2\pi f\tau}\,d\tau}_{H(f)} \tag{4.57}
$$

The system response to frequency $f$ is embodied in $H(f) = \int_{0}^{\infty} h(\tau)e^{-j2\pi f\tau}\,d\tau$, which
is the system 'frequency response function (FRF)'.

The expression of the convolution operation in the time domain is very much simplified
when the integral transform (Laplace or Fourier transform) is taken. If the response
is $y(t) = \int_{0}^{\infty} h(\tau)x(t - \tau)\,d\tau$, then taking the Fourier transform gives

$$Y(f) = \int_{-\infty}^{\infty}\int_{0}^{\infty} h(\tau)x(t - \tau)e^{-j2\pi f t}\,d\tau\,dt$$

Let $t - \tau = u$; then

$$Y(f) = \int_{0}^{\infty} h(\tau)e^{-j2\pi f\tau}\,d\tau\int_{-\infty}^{\infty} x(u)e^{-j2\pi f u}\,du$$

Thus,

$$Y(f) = H(f)X(f) \tag{4.58}$$

The convolution operation becomes a 'product' (see property (f) in Section 4.4). $H(f)$
is the Fourier transform of the impulse response function and is the frequency response
function of the system. Sometimes, Equation (4.58) is used to 'identify' a system if the
input and response are both available, i.e. $H(f) = Y(f)/X(f)$. Following on from this the
relationship between the input and output energy spectra is

$$|Y(f)|^2 = |H(f)|^2\,|X(f)|^2 \tag{4.59}$$

If the Laplace transform is taken (the Laplace transform will be discussed further in
Section 5.1), then by a similar argument as for the Fourier transform, it becomes

$$Y(s) = H(s)X(s) \tag{4.60}$$

where $s = \sigma + j\omega$ is complex. The ratio $Y(s)/X(s) = H(s)$ is called the transfer function of
the system. The relationships between the impulse response function, the frequency response
function and the transfer function are depicted in Figure 4.24. Note that $H(\omega)$ can be obtained
by evaluating $H(s)$ on the imaginary axis in the s-plane, i.e. the Fourier transform can be considered as
the Laplace transform taking the values on the imaginary axis only (see Section 5.1).



Figure 4.24 Relationship between h(t), H(w) and H(s) 

Examples of Systems

Example 1

Reconsider the simple acoustic problem in Figure 4.25, with input $x(t)$ and response $y(t)$.
The relationship between $x(t)$ and $y(t)$ may be modelled as

$$y(t) = a\,x(t - \Delta_1) + b\,x(t - \Delta_2) \tag{4.61}$$

The impulse response function relating $x(t)$ to $y(t)$ is

$$h(t) = a\,\delta(t - \Delta_1) + b\,\delta(t - \Delta_2) \tag{4.62}$$

and is illustrated in Figure 4.26.

Figure 4.25 A simple acoustic example (with a hard reflector)

Figure 4.26 Impulse response function for Example 1 (impulses of $h(t)$ at $t = \Delta_1$ and $t = \Delta_2$)


The frequency response function is

$$
H(\omega) = \int_{-\infty}^{\infty} h(t)e^{-j\omega t}\,dt = a\,e^{-j\omega\Delta_1} + b\,e^{-j\omega\Delta_2}
= a\,e^{-j\omega\Delta_1}\left(1 + \frac{b}{a}e^{-j\omega(\Delta_2 - \Delta_1)}\right) \tag{4.63}
$$

If we let $\Delta = \Delta_2 - \Delta_1$, then the modulus of $H(\omega)$ is

$$|H(\omega)| = a\sqrt{1 + \frac{b^2}{a^2} + \frac{2b}{a}\cos\omega\Delta} \tag{4.64}$$

This has an oscillatory form in frequency (compare this with the case depicted in Figure 4.15).
The phase component $\arg H(\omega)$ also has an oscillatory behaviour as expected from Equation
(4.63). These characteristics of the frequency response function are illustrated in Figure 4.27,
where $H(\omega)$ is represented as a vector on a polar diagram.

Figure 4.27 Polar diagram of $H(\omega)$ for Example 1 (where $\Delta_1 = 1$, $\Delta_2 = 4$ and $a/b = 2$)

Next, applying the Laplace transform to $h(t)$, the transfer function is

$$H(s) = \int_{-\infty}^{\infty} h(t)e^{-st}\,dt = a\,e^{-s\Delta_1} + b\,e^{-s\Delta_2} \tag{4.65}$$

Now we shall examine the poles and zeros in the s-plane. From Equation (4.65), it can be seen
that there are no poles. Zeros are found, such that $H(s) = 0$ when $a\,e^{-s\Delta_1} = -b\,e^{-s\Delta_2}$, i.e. at

$$e^{s\Delta} = -\frac{b}{a} \tag{4.66}$$

where $\Delta = \Delta_2 - \Delta_1$. Let $s = \sigma + j\omega$ so that Equation (4.66) can be written as

$$e^{\sigma\Delta}e^{j\omega\Delta} = \frac{b}{a}e^{\pm j(\pi + 2k\pi)} \tag{4.67}$$

where $k$ is an integer. Since $e^{\sigma\Delta} = b/a$ and $\omega\Delta = \pm\pi(2k + 1)$, zeros are located at

$$\sigma = \frac{1}{\Delta}\ln\left(\frac{b}{a}\right), \quad \omega = \pm\frac{\pi}{\Delta}(2k + 1) \tag{4.68}$$

and are depicted in Figure 4.28.

In the figure, the corresponding oscillatory nature of the modulus of the frequency response
function is seen, as it is in the phase. However, the phase has a superimposed linear
component due to the delay $\Delta_1$ of the first 'spike' in the impulse response function (see Figure
4.26).

Figure 4.28 Representation in the s-plane and its corresponding $H(\omega)$ for Example 1


Example 2 

Consider the single-degree-of-freedom mechanical system as given in Equation (4.48), which
can be rewritten in the following form:

$$\ddot{y}(t) + 2\zeta\omega_n\dot{y}(t) + \omega_n^2 y(t) = \frac{1}{m}x(t) \tag{4.69}$$

where $\omega_n = \sqrt{k/m}$ and $\zeta = c/2\sqrt{km}$. The impulse response function can be obtained from
$\ddot{h}(t) + 2\zeta\omega_n\dot{h}(t) + \omega_n^2 h(t) = (1/m)\delta(t)$, and assuming that the system is underdamped (i.e.
$0 < \zeta < 1$), the impulse response function is

$$h(t) = \frac{1}{m\omega_d}e^{-\zeta\omega_n t}\sin\omega_d t \tag{4.70}$$

where $\omega_d = \omega_n\sqrt{1 - \zeta^2}$, and is illustrated in Figure 4.29.

Figure 4.29 Impulse response function for Example 2

The corresponding frequency response function and transfer function are

$$H(\omega) = \frac{1/m}{\omega_n^2 - \omega^2 + j2\zeta\omega_n\omega} \tag{4.71}$$

$$H(s) = \frac{1/m}{s^2 + 2\zeta\omega_n s + \omega_n^2} \tag{4.72}$$

Note that there are only poles in the s-plane for this case as shown in Figure 4.30.

Figure 4.30 Representation in the s-plane and its corresponding $H(\omega)$ for Example 2
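
The link between the impulse response function and the frequency response function of Example 2 can be checked numerically. The following is a minimal sketch: the mass, damping ratio, natural frequency, record length and step size are illustrative assumptions. It Fourier transforms the impulse response of Equation (4.70) by a Riemann sum and compares the result with the closed form of Equation (4.71) at a few frequencies.

```python
import numpy as np

m, zeta = 1.0, 0.05                  # mass and damping ratio (illustrative)
wn = 2.0 * np.pi * 5.0               # undamped natural frequency in rad/s
wd = wn * np.sqrt(1.0 - zeta**2)     # damped natural frequency in rad/s

# Impulse response, Equation (4.70), sampled until it has decayed to a negligible level.
t = np.linspace(0.0, 20.0, 200001)
dt = t[1] - t[0]
h = np.exp(-zeta * wn * t) * np.sin(wd * t) / (m * wd)

# Fourier transform of h(t) at a few angular frequencies by a Riemann sum.
w = np.array([0.0, 0.5 * wn, wn, 2.0 * wn])
H_numeric = np.array([np.sum(h * np.exp(-1j * wk * t)) * dt for wk in w])

# Closed-form frequency response function, Equation (4.71).
H_closed = (1.0 / m) / (wn**2 - w**2 + 2j * zeta * wn * w)

print(np.abs(H_numeric))
print(np.abs(H_closed))
```

At $\omega = \omega_n$ the modulus is $1/(2\zeta m\omega_n^2)$, which is the resonance peak controlled by the damping ratio.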
4.8 GROUP DELAY$^{1}$ (DISPERSION)$^{\mathrm{M4.2}}$

We have seen that a pure delay results in a linear phase component. We now interpret nonlinear
phase characteristics. Suppose we have a system $H(\omega) = A(\omega)e^{j\phi(\omega)}$, where $A(\omega)$ $(= |H(\omega)|)$
is amplitude and $\phi(\omega)$ is phase. Consider a group of frequencies near $\omega_k$ in the range from
$\omega_k - B$ to $\omega_k + B$ ($B \ll \omega_k$), i.e. a narrow-band element of $H(\omega)$, approximated by $H_k(\omega)$, as
shown in Figure 4.31, i.e. $H(\omega) \approx \sum_k H_k(\omega)$, and $H_k(\omega) = |H_k(\omega)|e^{j\arg H_k(\omega)} = A(\omega_k)e^{j\phi(\omega)}$.
The phase $\phi(\omega)$ may be linearly approximated over the narrow frequency interval (by applying
the Taylor expansion) such that $\phi(\omega) \approx \phi(\omega_k) + (\omega - \omega_k)\phi'(\omega_k)$ as shown in Figure 4.32.
Then, $H_k(\omega)$ has the form of an ideal band-pass filter with a linear phase characteristic.


Figure 4.31 Narrow-band frequency components of $H_k(\omega)$: (a) magnitude, equal to $A(\omega_k)$ over bands of width $2B$ centred at $\pm\omega_k$; (b) phase, with $\phi(-\omega_k) = -\phi(\omega_k)$

Figure 4.32 Linear approximation of $\arg H_k(\omega)$ (slope $= \phi'(\omega_k)$)

Now, based on the representation $H(\omega) \approx \sum_k A(\omega_k)e^{j[\phi(\omega_k) + (\omega - \omega_k)\phi'(\omega_k)]}$ we shall inverse
transform this to obtain a corresponding expression for the impulse response function. We
start by noting that the 'equivalent' low-pass filter can be described as in Figure 4.33(a) whose
corresponding time signal is $2A(\omega_k)B\sin[B(t + \phi'(\omega_k))]/[\pi B(t + \phi'(\omega_k))]$ (see Equation
(4.39b) and No. 13 of Table 4.1). Now, consider the Fourier transform of a cosine function
with a phase, i.e. $F\{\cos(\omega_k t + \phi(\omega_k))\} = \pi\left[e^{j\phi(\omega_k)}\delta(\omega - \omega_k) + e^{-j\phi(\omega_k)}\delta(\omega + \omega_k)\right]$ as shown
in Figure 4.33(b). In fact, $H_k(\omega)$ can be obtained by taking the convolution of Figures 4.33(a)
and (b) in the frequency domain. This may be justified by noting that the frequency domain
convolution described in Equation (4.45) can be rewritten as

$$
\begin{aligned}
X(f) * W(f) &= \int_{-\infty}^{\infty} X(g)W(f - g)\,dg = \int_{-\infty}^{\infty} |X(g)|e^{j\phi_X(g)}\,|W(f - g)|e^{j\phi_W(f - g)}\,dg \\
&= \int_{-\infty}^{\infty} |X(g)|\cdot|W(f - g)|\,e^{j[\phi_X(g) + \phi_W(f - g)]}\,dg
\end{aligned} \tag{4.73}
$$

$^{1}$ See Zadeh and Desoer (1963); Papoulis (1977).

Figure 4.33 Frequency and time domain representation of $H_k(\omega)$: (a) equivalent low-pass filter (magnitude $2A(\omega_k)$ for $-B \le \omega \le B$; phase with slope $\phi'(\omega_k)$); (b) Fourier transform of the carrier (magnitude $\pi$ at $\pm\omega_k$; phase $\pm\phi(\omega_k)$), combined with (a) by convolution; (c) time domain representation, the envelope $2A(\omega_k)B\sin[B(t - t_g(\omega_k))]/[\pi B(t - t_g(\omega_k))]$ multiplied by the carrier $\cos(\omega_k t + \phi(\omega_k)) = \cos\omega_k(t - t_p(\omega_k))$


Thus, the frequency domain convolution (Equation (4.73)) may be interpreted in the form that
the resultant magnitude is the running sum of the multiplication of two magnitude functions
while the resultant phase is the running sum of the addition of two phase functions.

Since the convolution in the frequency domain results in the multiplication in the time
domain (see Equation (4.45)) as depicted in Figure 4.33(c), the inverse Fourier transform of
$H_k(\omega)$ becomes

$$
F^{-1}\{H_k(\omega)\} \approx 2A(\omega_k)B\,\frac{\sin[B(t + \phi'(\omega_k))]}{\pi B(t + \phi'(\omega_k))}\cos(\omega_k t + \phi(\omega_k)) \tag{4.74}
$$

and finally, the inverse Fourier transform of $H(\omega)$ is

$$
h(t) = F^{-1}\{H(\omega)\} \approx \sum_k \underbrace{2A(\omega_k)B\,\frac{\sin[B(t - t_g(\omega_k))]}{\pi B(t - t_g(\omega_k))}}_{\text{envelope}}\;\underbrace{\cos\omega_k(t - t_p(\omega_k))}_{\text{carrier}} \tag{4.75}
$$

where $t_g$ and $t_p$ are the 'group delay' and 'phase delay' respectively, and are defined by
Equations (4.76) and (4.77). The relationship between these two properties is illustrated in
Figure 4.34.

$$t_g(\omega) = -\frac{d\phi(\omega)}{d\omega} \tag{4.76}$$

$$t_p(\omega) = -\frac{\phi(\omega)}{\omega} \tag{4.77}$$

Figure 4.34 Illustrations of group delay and phase delay in the frequency domain


Note that each signal component given in Equation (4.75) is an amplitude modulation
signal where the 'envelope' is delayed by $t_g$, while the 'carrier' is delayed by $t_p$. This is
illustrated in Figure 4.35. As shown in the figure, the phase delay gives the time delay of
each sinusoidal component while the group delay can be interpreted as the time delay of the
amplitude envelope (or the group of sinusoidal components within a small frequency band
centred at $\omega_k$). The delays are continuous functions of $\omega$, i.e. they may have different values
at different frequencies. The deviation of the group delay away from a constant indicates the
degree of nonlinearity of the phase. If a system has a non-constant group delay, each frequency
component in the input is delayed differently, so the shape of the output signal will be different
from that of the input. This phenomenon is called dispersion. In our simple acoustic models (e.g.
Figure 4.25), a single path is non-dispersive, but the inclusion of an echo results in a nonlinear
phase characteristic. Most structural systems exhibit dispersive characteristics.

In the case of a pure delay, the group delay and the phase delay are the same as shown
in Figure 4.36 (compare the carrier signal with that in Figure 4.35 where the group delay and
the phase delay are different).

Figure 4.35 Illustrations of group delay and phase delay in the time domain

Figure 4.36 The case of pure delay (group delay and phase delay are the same)

Directly allied concepts in sound and vibration are the group velocity and the phase
velocity of a wave, which are defined by

$$\text{Group velocity of a wave:}\quad v_g = \frac{d\omega}{dk} \tag{4.78}$$

$$\text{Phase velocity of a wave:}\quad v_p = \frac{\omega}{k} \tag{4.79}$$

where $\omega$ is the wave's angular frequency, and $k = 2\pi/\lambda$ is the angular wave number ($\lambda$ is
the wavelength in the medium). The group velocity and the phase velocity are the same for
a non-dispersive wave. Since velocity is distance divided by time taken, the group delay is
related to the group velocity of a wave and the phase delay to the phase velocity.


4.9 MINIMUM AND NON-MINIMUM PHASE SYSTEMS 


All-pass Filter 

We shall now consider the phase characteristics of a special filter (system). Suppose we
have a filter with transfer function

$$H(s) = \frac{s - a}{s + a} \tag{4.80}$$

The pole-zero map on the s-plane is shown in Figure 4.37.

Figure 4.37 The pole-zero map of Equation (4.80)

Equation (4.80) may be rewritten as

$$H(s) = 1 - \frac{2a}{s + a} \tag{4.81}$$

Then, taking the inverse Laplace transform gives the impulse response function

$$h(t) = \delta(t) - 2a\,e^{-at} \tag{4.82}$$

which is depicted in Figure 4.38.
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Figure 4.38 Impulse response function of the all-pass filter 


The corresponding frequency response function is 

$$H(\omega) = \frac{j\omega - a}{j\omega + a} \tag{4.83}$$

Thus, the modulus of $H(\omega)$ is 

$$|H(\omega)| = \frac{\sqrt{\omega^2 + a^2}}{\sqrt{\omega^2 + a^2}} = 1 \tag{4.84}$$

This implies that there is no amplitude distortion through this filter, so it is called the 
‘all-pass filter’. But note that the phase of the filter is nonlinear, as given in Equation 
(4.85) and Figure 4.39. So, the all-pass filter distorts the shape of the input signal. 

$$\arg H(\omega) = \arg(j\omega - a) - \arg(j\omega + a) = \pi - 2\tan^{-1}\frac{\omega}{a} \quad (\omega \ge 0) \tag{4.85}$$

Figure 4.39 Phase characteristic of the all-pass filter ($\arg H(\omega)$ against $\omega$) 


From Equation (4.85), the group delay of the all-pass system is 

$$-\frac{d}{d\omega}\left(\arg H(\omega)\right) = \frac{2}{a\left(1 + \omega^2/a^2\right)} \tag{4.86}$$

Note that the group delay is always positive as shown in Figure 4.40. 

Figure 4.40 Group delay of the all-pass filter (shown for $\omega \ge 0$) 

$x(t)$ → [All-pass system] → $y(t)$ 

Figure 4.41 Input-output relationship of the all-pass system 
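The modulus, phase and group delay of the first-order all-pass filter can be checked numerically. The sketch below is only an illustration: the value of `a` and the frequency grid are arbitrary assumptions, and the group delay is obtained by a simple numerical derivative of the phase.

```matlab
% Sketch: numerical check of Equations (4.84)-(4.86) for H = (jw - a)/(jw + a).
a = 2;                                   % assumed filter parameter (rad/s), a > 0
w = linspace(0.01, 20, 2000);            % assumed frequency grid (rad/s), omega > 0
H = (1j*w - a) ./ (1j*w + a);            % frequency response, Equation (4.83)

% Modulus should be one at every frequency, Equation (4.84)
mag_err = max(abs(abs(H) - 1));

% Phase compared with pi - 2*atan(w/a), Equation (4.85)
phase_err = max(abs(angle(H) - (pi - 2*atan(w/a))));

% Group delay: minus the derivative of the phase, compared with Equation (4.86)
tg_num     = -gradient(angle(H), w);
tg_formula = 2 ./ (a * (1 + w.^2 / a^2));
tg_err     = max(abs(tg_num(2:end-1) - tg_formula(2:end-1)));

fprintf('modulus error %.2e, phase error %.2e, group delay error %.2e\n', ...
        mag_err, phase_err, tg_err);
```

The three reported errors should be small, the last one being limited by the finite-difference approximation.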


Now, suppose that the response of an all-pass system to an input $x(t)$ is $y(t)$ as in 
Figure 4.41. 

Then, the following properties are obtained: 

(i) $$\int_{-\infty}^{\infty} |x(t)|^2\,dt = \int_{-\infty}^{\infty} |y(t)|^2\,dt \tag{4.87}$$

(ii) $$\int_{-\infty}^{t_0} |x(t)|^2\,dt \ge \int_{-\infty}^{t_0} |y(t)|^2\,dt \tag{4.88}$$

The first Equation (4.87) follows directly from Parseval’s theorem. The second Equation (4.88) 
implies that the energy ‘build-up’ in the input is more rapid than in the output, and the proof 
is as follows. Let $y_1(t)$ be the output of the system to the input 

$$x_1(t) = \begin{cases} x(t), & t \le t_0 \\ 0, & t > t_0 \end{cases}$$

Then for $t \le t_0$, 

$$y_1(t) = \int_{-\infty}^{t} h(t - \tau)x_1(\tau)\,d\tau = \int_{-\infty}^{t} h(t - \tau)x(\tau)\,d\tau = y(t) \tag{4.89}$$

Applying Equation (4.87) to input $x_1(t)$ and output $y_1(t)$, then 

$$\int_{-\infty}^{t_0} |x_1(t)|^2\,dt = \int_{-\infty}^{\infty} |y_1(t)|^2\,dt = \int_{-\infty}^{t_0} |y_1(t)|^2\,dt + \int_{t_0}^{\infty} |y_1(t)|^2\,dt \tag{4.90}$$

Since the last integral is non-negative, Equation (4.88) follows because $x(t) = x_1(t)$ and 
$y(t) = y_1(t)$ for $t \le t_0$. 
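Properties (4.87) and (4.88) can be illustrated numerically for the first-order all-pass filter with impulse response $h(t) = \delta(t) - 2ae^{-at}$. The sketch below is a rough check under stated assumptions: the test input, the value of `a` and the time step are arbitrary, and the convolution integral is approximated by the trapezoidal rule, so the relations hold only up to discretization error.

```matlab
% Sketch: running energy of input and output of the all-pass filter (4.82).
a  = 5;                                   % assumed filter parameter (rad/s)
dt = 1e-3;  t = 0:dt:4;  N = numel(t);    % assumed time grid
x  = exp(-((t - 1)/0.1).^2) .* cos(2*pi*3*t);   % assumed test input (smooth pulse)

g  = exp(-a*t);                           % exponential part of h(t)
cf = conv(g, x);                          % discrete convolution sums
% Trapezoidal approximation of the convolution integral of g and x
c  = dt * (cf(1:N) - 0.5*g*x(1) - 0.5*g(1)*x);
y  = x - 2*a*c;                           % y(t) = x(t) - 2a * (e^{-at} * x)(t)

Ex = cumsum(x.^2) * dt;                   % input energy up to each time t0
Ey = cumsum(y.^2) * dt;                   % output energy up to each time t0

fprintf('Total energy: input %.5f, output %.5f\n', Ex(end), Ey(end));
fprintf('Smallest value of Ex - Ey over all t0: %.2e\n', min(Ex - Ey));
```

The total energies should agree closely, as in (4.87), while the running input energy should stay at or above the running output energy, as in (4.88).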


Minimum and Non-minimum Phase Systems 

A stable causal system has all its poles in the left half of the s-plane. This is referred 
to as BIBO (Bounded Input/Bounded Output) stable, i.e. the output will be bounded for 
every bounded input to the system. In the time domain, the necessary and sufficient 
condition for BIBO stability is $\int_{-\infty}^{\infty} |h(t)|\,dt < \infty$. We now assume that the system 
is causal and satisfies the BIBO stability criterion. Then, systems may be classified by 
the structure of the poles and zeros as follows: a system with all its poles and zeros in 
the left half of the s-plane is a minimum phase system; a system with all its zeros in the 
right half of the s-plane is a maximum phase system; a system with some zeros in the 
left and some in the right half plane is a mixed phase (or non-minimum phase) system. 
The meaning of ‘minimum phase’ will be explained shortly. 

Consider the following (stable) maximum phase system which has poles and a zero 
as shown in Figure 4.42: 

$$H(s) = \frac{s - a}{s^2 + 2\zeta\omega_n s + \omega_n^2} \tag{4.91}$$

Figure 4.42 The pole-zero map of Equation (4.91) (s-plane) 


This may be expressed as 

$$H(s) = \left(\frac{s + a}{s^2 + 2\zeta\omega_n s + \omega_n^2}\right)\left(\frac{s - a}{s + a}\right) = H_{min}(s)H_{ap}(s) \tag{4.92}$$

where $H_{min}(s)$ is the minimum phase system with $|H_{min}(\omega)| = |H(\omega)|$, and $H_{ap}(s)$ is the 
all-pass system with $|H_{ap}(\omega)| = 1$. This decomposition is very useful when dealing with 
‘inverse’ problems (Oppenheim et al., 1999). Note that the direct inversion of the system, 
$H^{-1}(s)$, has a pole in the right half of the s-plane, so the system is unstable. On the other 
hand, the inverse of a minimum phase system, $H_{min}^{-1}(s)$, is always stable. 
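The decomposition in Equation (4.92) is easy to verify on a frequency grid. The following sketch uses assumed values for `a`, `zeta` and `wn` (they are not from the text) and checks that the product of the two factors reproduces $H$, that the moduli agree, and where the poles of the two inverse systems lie.

```matlab
% Sketch: minimum phase / all-pass decomposition of Equation (4.91).
a = 2;  zeta = 0.1;  wn = 10;            % assumed parameters
w = linspace(0.1, 50, 500);              % assumed frequency grid (rad/s)
s = 1j*w;

D    = s.^2 + 2*zeta*wn*s + wn^2;        % common denominator
H    = (s - a) ./ D;                     % maximum phase system, Equation (4.91)
Hmin = (s + a) ./ D;                     % minimum phase factor
Hap  = (s - a) ./ (s + a);               % all-pass factor

prod_err = max(abs(H - Hmin .* Hap));    % H = Hmin * Hap
mag_err  = max(abs(abs(H) - abs(Hmin))); % |H| = |Hmin|
ap_err   = max(abs(abs(Hap) - 1));       % |Hap| = 1

% The zero of each system becomes a pole of its inverse
pole_inv     = roots([1 -a]);            % pole of 1/H, in the right half plane
pole_inv_min = roots([1  a]);            % pole of 1/Hmin, in the left half plane

fprintf('errors: %.1e %.1e %.1e\n', prod_err, mag_err, ap_err);
fprintf('pole of 1/H at s = %g, pole of 1/Hmin at s = %g\n', pole_inv, pole_inv_min);
```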


The term ‘minimum phase’ may be explained by comparing two systems, $H_1(s) = (s + a)/D(s)$ 
and $H_2(s) = (s - a)/D(s)$. Both systems have the same pole structure but the 
zeros are at $-a$ and $a$ respectively, so the phase of the system is 

$$\arg H_1(\omega) = \tan^{-1}\left(\frac{\omega}{a}\right) - \arg D(\omega) \tag{4.93}$$

$$\arg H_2(\omega) = \pi - \tan^{-1}\left(\frac{\omega}{a}\right) - \arg D(\omega) \tag{4.94}$$

Comparing $\tan^{-1}(\omega/a)$ and $\pi - \tan^{-1}(\omega/a)$, it can be easily seen that $\arg H_1(\omega) < \arg H_2(\omega)$ 
as shown in Figure 4.43. 

Figure 4.43 Phase characteristics of $H_1(\omega)$ and $H_2(\omega)$ 

Figure 4.44 Phase characteristics of $H_1(s)$ and $H_2(s)$ (s-plane) 


Or, the angles in the s-plane show that $\alpha < \beta$ as shown in Figure 4.44. This implies that 
$H_1(s)$ is minimum phase, since ‘phase of $H_1(s)$ < phase of $H_2(s)$’. 


It follows that, if $H(s)$ is a stable transfer function with zeros anywhere and $H_{min}(s)$ 
is a minimum phase system with $|H(\omega)| = |H_{min}(\omega)|$, then the group delay of $H(s)$, 
$-d \arg H(\omega)/d\omega$, is larger than $-d \arg H_{min}(\omega)/d\omega$. Also, if input $x(t)$ is applied to 
arbitrary system $H(s)$ giving response $y(t)$ and to $H_{min}(s)$ giving response $y_{min}(t)$, then 
for any $t_0$ the following energy relationship is given: 

$$\int_{-\infty}^{t_0} |y(t)|^2\,dt \le \int_{-\infty}^{t_0} |y_{min}(t)|^2\,dt \tag{4.95}$$

This follows from Equation (4.88), because $y(t)$ is the response of the all-pass system 
$H_{ap}(s)$ to the input $y_{min}(t)$. 

As a practical example, consider the cantilever beam excited by a shaker as shown in 
Figure 4.45. Let the signal from the force transducer be the input $x(t)$, and the signals 
from the accelerometers be the outputs $y_1(t)$ and $y_2(t)$ for positions 1 and 2 respectively. 
Also, let $H_1(\omega)$ and $H_2(\omega)$ be the frequency response functions between $x(t)$ and $y_1(t)$, 
and between $x(t)$ and $y_2(t)$ respectively. 

[Sketch labels: shaker with force transducer; accelerometers at Position 1 and Position 2] 

Figure 4.45 Cantilever beam excited by a shaker 


If the input and the output are collocated (i.e. measured at the same point) the 
frequency response function $H_1(\omega)$ is minimum phase, and if they are non-collocated the 
frequency response function $H_2(\omega)$ is non-minimum phase (Lee, 2000). Typical 
characteristics of the accelerance frequency response functions $H_1(\omega)$ and $H_2(\omega)$ are shown in 
Figure 4.46. Note that the minimum phase system $H_1(\omega)$ shows distinct anti-resonances 
with a phase response over $0 \le \arg H_1(\omega) \le \pi$. 

Figure 4.46 Frequency response functions of the system in Figure 4.45 (modulus and phase): 
(a) Minimum phase system; (b) Non-minimum phase system 


4.10 THE HILBERT TRANSFORM (M4.3-5) 

Consider the input-output relationship as described in Figure 4.47. 

$x(t)$ → [ $h(t) = 1/\pi t$ ] → $y(t) = \hat{x}(t)$ 

Figure 4.47 Input-output relationship of the 90° phase shifter 

The output of the system is the convolution of $x(t)$ with $1/\pi t$: 

$$\hat{x}(t) = h(t) * x(t) = \frac{1}{\pi t} * x(t) \tag{4.96}$$


This operation is called the Hilbert transform. Note that $h(t)$ is a non-causal filter with 
a singularity at $t = 0$. The Fourier transform of the above convolution operation can be 
written as 

$$\hat{X}(\omega) = H(\omega)X(\omega) \tag{4.97}$$

where $H(\omega)$ is the Fourier transform of $1/\pi t$, which is given by (see No. 16 of Table 4.1) 

$$H(\omega) = -j\,\mathrm{sgn}(\omega) = \begin{cases} -j & \text{for } \omega > 0 \\ j & \text{for } \omega < 0 \\ 0 & \text{for } \omega = 0 \end{cases} \tag{4.98a}$$

or, equivalently, 

$$H(\omega) = \begin{cases} e^{-j(\pi/2)} & \text{for } \omega > 0 \\ e^{j(\pi/2)} & \text{for } \omega < 0 \\ 0 & \text{for } \omega = 0 \end{cases} \tag{4.98b}$$

From Equation (4.98), it can be seen that 

$$|H(\omega)| = 1 \quad \text{for all } \omega \text{, except } \omega = 0 \tag{4.99}$$

$$\arg H(\omega) = \begin{cases} -\pi/2 & \text{for } \omega > 0 \\ \pi/2 & \text{for } \omega < 0 \end{cases} \tag{4.100}$$

Thus, the Hilbert transform is often referred to as a 90° phase shifter. For example, the 
Hilbert transform of $\cos\omega_0 t$ is $\sin\omega_0 t$, and that of $\sin\omega_0 t$ is $-\cos\omega_0 t$. 
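The 90° phase shift can be demonstrated with a discrete approximation of Equation (4.98a), applying $-j\,\mathrm{sgn}(\omega)$ to the DFT bins of a sampled cosine. This is a minimal sketch under assumed values of the sampling rate, record length and frequency; an integer number of cycles is chosen so that the DFT represents the cosine exactly.

```matlab
% Sketch: Hilbert transform of a cosine by multiplying its DFT by -j*sgn(omega).
N  = 1000;  fs = 1000;                    % assumed record length and sampling rate
t  = (0:N-1)/fs;
f0 = 20;                                  % assumed frequency (integer cycles in the record)
x  = cos(2*pi*f0*t);

X = fft(x);
k = 0:N-1;                                % DFT bin index
sgn = zeros(1, N);                        % bins 0 and N/2 are left at zero
sgn(k > 0 & k < N/2) =  1;                % positive-frequency bins
sgn(k > N/2)         = -1;                % negative-frequency bins

xh = real(ifft(-1j * sgn .* X));          % discrete version of Equation (4.97)

err = max(abs(xh - sin(2*pi*f0*t)));      % compare with sin(w0*t)
fprintf('Maximum deviation from sin(w0 t): %.2e\n', err);
```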

The significance of the Hilbert transform is that it is used to form the so-called 
‘analytic signal’ or ‘pre-envelope signal’. An analytic signal is a complex time signal 
whose real part is the original signal $x(t)$ and whose imaginary part is the Hilbert transform 
of $x(t)$, i.e. $\hat{x}(t)$. Thus, the analytic signal $a_x(t)$ is defined as 

$$a_x(t) = x(t) + j\hat{x}(t) \tag{4.101}$$

The Fourier transform of the analytic signal, $F\{a_x(t)\}$, is zero for $\omega < 0$, and is $2X(\omega)$ for 
$\omega > 0$ and $X(\omega)$ for $\omega = 0$. Since the analytic signal is complex, it can be expressed as 

$$a_x(t) = A_x(t)e^{j\phi_x(t)} \tag{4.102}$$

where $A_x(t) = \sqrt{x^2(t) + \hat{x}^2(t)}$ is the instantaneous amplitude, and 
$\phi_x(t) = \tan^{-1}(\hat{x}(t)/x(t))$ is the instantaneous phase. The time derivative of the unwrapped 
instantaneous phase, $\omega_x(t) = \dot{\phi}_x(t) = d\phi_x(t)/dt$, is called the instantaneous frequency. For 
a trivial case $x(t) = \cos\omega_0 t$, the analytic signal is $a_x(t) = e^{j\omega_0 t}$ where $A_x(t) = 1$ and 
$\omega_x(t) = \omega_0$, i.e. both are constants as expected. These concepts of instantaneous 
amplitude, phase and frequency are particularly useful for amplitude-modulated and 
frequency-modulated signals. 


To visualize these concepts, consider the following amplitude-modulated signal (M4.3): 

$$x(t) = m(t)\cos\omega_c t = (A_c + A_m \sin\omega_m t)\cos\omega_c t \tag{4.103}$$

where $\omega_c > \omega_m$. We note that if $m(t)$ is band-limited and has a maximum frequency less 
than $\omega_c$, the Hilbert transform of $x(t) = m(t)\cos\omega_c t$ is $\hat{x}(t) = m(t)\sin\omega_c t$. Then, using the 
relationship between Equations (4.101) and (4.102), the analytic signal can be written as 

$$a_x(t) = A_x(t)e^{j\phi_x(t)} = (A_c + A_m \sin\omega_m t)\,e^{j\omega_c t} \tag{4.104}$$

and the corresponding $A_x(t)$, $\phi_x(t)$ and $\omega_x(t)$ are as shown in Figure 4.48. 

Figure 4.48 Analytic signal associated with the amplitude-modulated signal; the panels include 
(b) Instantaneous (unwrapped) phase $\phi_x(t)$ and (c) Instantaneous frequency $\omega_x(t)$ 

In sound and vibration engineering, a practical application of the Hilbert transform 
related to amplitude modulation/demodulation is ‘envelope analysis’ (Randall, 1987), where the 
demodulation refers to a technique that extracts the modulating components, e.g. extracting 
$A_m \sin\omega_m t$ from Equation (4.103). Envelope analysis is used for the early detection of a 
machine fault. For example, a fault in an outer race of a rolling bearing may generate a series of 
burst signals at a regular interval. Such burst signals decay very quickly and contain relatively 
small energies, thus the usual Fourier analysis may not reveal the repetition frequency of the 
bursts. However, it may be possible to detect this frequency component by forming the analytic 
signal and then applying Fourier analysis to the envelope $A_x(t)$. 
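A minimal sketch of envelope analysis on a synthetic signal is given below. Everything about the signal is an assumption made for illustration (a train of decaying bursts at an assumed resonance `fr`, repeating at an assumed rate `frep`); the analytic signal is formed directly from the one-sided DFT so that no toolbox function is needed.

```matlab
% Sketch: envelope analysis of a synthetic train of decaying bursts.
fs = 20000;  N = 20000;                   % assumed sampling rate and record length (1 s)
n  = 0:N-1;
fr   = 3000;                              % assumed resonance (carrier) frequency, Hz
frep = 100;                               % assumed burst repetition rate, Hz

tau = mod(n, fs/frep) / fs;               % time since the most recent burst
x   = exp(-1500*tau) .* sin(2*pi*fr*tau); % bursts decaying quickly after each impact

% Analytic signal: keep DC and Nyquist, double positive frequencies, zero negative ones
X = fft(x);
h = zeros(1, N);
h(1) = 1;  h(N/2 + 1) = 1;  h(2:N/2) = 2;
ax  = ifft(X .* h);
env = abs(ax);                            % instantaneous amplitude A_x(t)

% Fourier analysis of the envelope (mean removed)
E = abs(fft(env - mean(env))) / N;
f = n * fs / N;
[~, idx] = max(E(1:N/2));
fprintf('Largest line in the envelope spectrum: %.1f Hz\n', f(idx));
```

For this synthetic record the dominant line of the envelope spectrum should fall at the assumed repetition rate, even though the signal energy itself is concentrated around `fr`.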


Examples 

Example 1: Estimation of damping from time domain records 
of an oscillator (M4.4) 

Suppose we have a free response of a damped single-degree-of-freedom system as below: 

$$x(t) = Ae^{-\zeta\omega_n t}\sin(\omega_d t + \phi), \quad t \ge 0 \tag{4.105}$$

where $\omega_d = \omega_n\sqrt{1 - \zeta^2}$. The analytic signal for this may be approximated as 

$$a_x(t) = A_x(t)e^{j\phi_x(t)} \approx \left(Ae^{-\zeta\omega_n t}\right)e^{j(\omega_d t + \phi - \pi/2)}, \quad t \ge 0 \tag{4.106}$$

Since $\ln A_x(t) \approx \ln A - \zeta\omega_n t$, the damping ratio $\zeta$ can be estimated from the plot of $\ln A_x(t)$ 
versus time, provided that the natural frequency $\omega_n$ is known. This is demonstrated in MATLAB 
Example 4.4. However, as shown in MATLAB Example 4.4, it must be noted that $A_x(t)$ and 
$\phi_x(t)$ are usually distorted, especially at the beginning and the last parts of the signal. This 
undesirable phenomenon arises from the following: (i) the modulating component $Ae^{-\zeta\omega_n t}$ 
is not band-limited, (ii) the non-causal nature of the filter ($h(t) = 1/\pi t$), and (iii) practical 
windowing effects (truncation in the frequency domain). Thus, the part of the signal near 
$t = 0$ must be avoided in the estimation of the damping characteristic. The windowing effect 
is discussed in the next section. 

Example 2: Frequency modulation (M4.5) 

Now, to demonstrate another feature of the analytic signal, we consider the frequency-modulated 
signal as given below: 

$$x(t) = A_c\cos(\omega_c t + A_m\sin\omega_m t) \tag{4.107}$$

This can be written as $x(t) = A_c[\cos\omega_c t\cos(A_m\sin\omega_m t) - \sin\omega_c t\sin(A_m\sin\omega_m t)]$, which 
consists of two amplitude-modulated signals, i.e. $x(t) = m_1(t)\cos\omega_c t - m_2(t)\sin\omega_c t$, where 
$m_1(t)$ and $m_2(t)$ may be approximated as band-limited (Oppenheim et al., 1999). So, for 
$\omega_m \ll \omega_c$, the analytic signal associated with Equation (4.107) may be approximated as 

$$a_x(t) = A_x(t)e^{j\phi_x(t)} \approx A_c e^{j(\omega_c t + A_m\sin\omega_m t)} \tag{4.108}$$

and the corresponding $A_x(t)$, $\phi_x(t)$ and $\omega_x(t)$ are as shown in Figure 4.49. Note that the 
instantaneous frequency is $\omega_x(t) = d\phi_x(t)/dt = \omega_c + \omega_m A_m\cos\omega_m t$, as can be seen in Figure 
4.49(c). 

Figure 4.49 Analytic signal associated with the frequency-modulated signal 

From this example, we have seen that it may be possible to examine how the frequency 
contents of a signal vary with time by forming an analytic signal. We have seen two methods of 
relating the temporal and frequency structure of a signal. First, based on the Fourier transform 
we saw how group delay relates how groups of frequencies are delayed (shifted) in time, i.e. the 
group delays are time dependent. Second, we have seen a ‘non-Fourier’ type of representation 
of a signal as $A(t)\cos\phi(t)$ (based on the analytic signal derived using the Hilbert transform). 
This uses the concepts of amplitude modulation and instantaneous phase and frequency. 

These two approaches are different and only under certain conditions do they give similar 
results (for signals with large bandwidth-time product - see the uncertainty principle in the 
next section). These considerations are fundamental to many of the time-frequency analyses 
of signals. Readers may find useful information on time-frequency methods in two review 
papers (Cohen, 1989; Hammond and White, 1996). 

4.11 THE EFFECT OF DATA TRUNCATION (WINDOWING) (M4.6-9) 

Suppose $x(t)$ is a deterministic signal but is known only for $-T/2 \le t \le T/2$, as shown 
in Figure 4.50. 

Figure 4.50 Truncated data with a rectangular window $w(t)$ 


In effect, we are observing the data through a window $w(t)$ where 

$$w(t) = \begin{cases} 1 & |t| < T/2 \\ 0 & |t| > T/2 \end{cases} \tag{4.109}$$

so that we see the truncated data $x_T(t) = x(t)w(t)$. 

If we Fourier transform $x_T(t)$ (in an effort to get $X(f)$) we obtain the Fourier 
transform of the product of two signals $x(t)$ and $w(t)$ as (see Equation (4.45)) 

$$X_T(f) = F\{x(t)w(t)\} = \int_{-\infty}^{\infty} X(g)W(f - g)\,dg = X(f) * W(f) \tag{4.110}$$

i.e. the Fourier transform of the product of two time signals is the convolution 
of their Fourier transforms. $W(f)$ is called the spectral window, and is 
$W(f) = T\sin(\pi fT)/\pi fT$ for the rectangular window. Owing to this convolution operation in 
the frequency domain, the window (which need not be restricted to the rectangular data 
window) results in bias or truncation error. Recall the shape of $W(f)$ for the rectangular 
window as in Figure 4.51. 

Figure 4.51 Fourier transform of the rectangular window $w(t)$ 

The convolution integral indicates that the shape of $X(g)$ is distorted, such that it 
broadens the true Fourier transform. The distortion due to the main lobe is sometimes 
called smearing, and the distortion caused by the side lobes is called leakage since the 
frequency components of $X(g)$ at values other than $g = f$ ‘leak’ through the side lobes 
to contribute to the value of $X_T(f)$ at $f$. For example, consider a sinusoidal signal 
$x(t) = \cos(2\pi pt)$ whose Fourier transform is $X(f) = \frac{1}{2}[\delta(f + p) + \delta(f - p)]$. Then 
the Fourier transform of the truncated signal $x_T(t)$ is 

$$X_T(f) = \int_{-\infty}^{\infty} X(g)W(f - g)\,dg = \frac{1}{2}\int_{-\infty}^{\infty} [\delta(g + p) + \delta(g - p)]W(f - g)\,dg = \frac{1}{2}[W(f + p) + W(f - p)] \tag{4.111}$$

This shows that the delta functions (in the frequency domain) are replaced by the shape of 
the spectral window. The ‘theoretical’ and ‘achieved (windowed)’ spectra are illustrated 
in Figure 4.52 (compare $X(f)$ and $X_T(f)$ for both shape and magnitude). 


Figure 4.52 Fourier transform of a cosine wave: (a) Theoretical, $X(f)$ with delta functions 
of weight 1/2 at $f = -p$ and $f = p$; (b) Windowed, $X_T(f) = X(f) * W(f)$ 
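Equation (4.111) can be checked by evaluating the Fourier integral of the truncated cosine numerically and comparing it with the shifted spectral windows. The sketch below assumes arbitrary values for the record length `T` and the frequency `p`, and uses a frequency grid chosen to avoid the removable singularity of $W(f)$ at zero argument.

```matlab
% Sketch: Fourier transform of a truncated cosine versus Equation (4.111).
T = 2;  p = 5;                            % assumed record length (s) and frequency (Hz)
t = linspace(-T/2, T/2, 20001);           % time grid over the observed interval
f = 0.05:0.1:9.95;                        % frequency grid (Hz), never equal to +/- p

% Spectral window of the rectangular window, W(f) = T sin(pi f T)/(pi f T)
W = @(fq) T * sin(pi*fq*T) ./ (pi*fq*T);

% Direct numerical Fourier integral of x_T(t) = cos(2 pi p t), |t| <= T/2
XT_num = zeros(size(f));
for m = 1:numel(f)
    XT_num(m) = trapz(t, cos(2*pi*p*t) .* exp(-1j*2*pi*f(m)*t));
end

XT_formula = 0.5 * (W(f + p) + W(f - p)); % Equation (4.111)

err = max(abs(XT_num - XT_formula));
fprintf('Maximum difference between integral and formula: %.2e\n', err);
```

Plotting `abs(XT_formula)` against `f` shows the main lobe centred at `p` and the side lobes responsible for leakage.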


If two or more closely spaced sinusoidal components are present in a signal, then they 
may not easily be resolved in the frequency domain because of the distortion (especially 
due to the main lobe). A rough guide as to the effect of this rectangular window is obtained 
from Figure 4.53 (shown for $f \ge 0$ only). 

Figure 4.53 Effects of windowing on the modulus of the Fourier transform, $|X_T(f)|$, where 
$x(t)$ is the sum of three sine (or cosine) waves: in one case there is considerable smearing due to 
the spectral window; in the other the three components are resolved but with 
considerable leakage at other frequencies 

In fact, in order to get two separate peaks at the frequencies $f_1$, $f_2$ given in this example 
it is necessary to use a data length $T$ of order $T \ge 2/(f_2 - f_1)$ (i.e. $f_2 - f_1 \ge 2/T$) 
for the rectangular window. Note that the rectangular window is considered a ‘poor’ 
window with respect to the side lobes, i.e. the side lobes are large and decay slowly. The 
highest side lobe is 13 dB below the peak of the main lobe, and the asymptotic roll-off 
is 6 dB/octave. This results from the sharp corners of the rectangular window. However, 
the main lobe of the rectangular window is narrower than that of any other window. 
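The quoted side lobe level can be confirmed directly from $W(f) = T\sin(\pi fT)/\pi fT$. The following sketch searches for the largest value of $|W(f)|/T$ between the first and second zero crossings; the grid resolution is an arbitrary assumption.

```matlab
% Sketch: highest side lobe of the rectangular window's spectral window.
u  = linspace(1, 2, 100001);              % u = f*T, between the first two zeros of W(f)
Wn = abs(sin(pi*u) ./ (pi*u));            % |W(f)| normalized by its peak value T
[pk, i] = max(Wn);                        % peak of the first (highest) side lobe

level_dB = 20*log10(pk);
fprintf('Highest side lobe: %.2f dB at fT = %.3f\n', level_dB, u(i));
```

The level found this way is about 13.3 dB below the main lobe peak, consistent with the figure quoted above.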

MATLAB examples are given at the end of the chapter. Since we are using sinusoidal 
signals in MATLAB Examples 4.6 and 4.7, it is interesting to compare this windowing 
effect with the computational considerations for a periodic signal given in Section 3.6 
(and with MATLAB Example 3.2). 


A wide variety of windows are available, each with its own frequency characteristics. For 
example, by tapering the windows to zero, the side lobes can be reduced but the main lobe is 
wider than that of the rectangular window, i.e. increased smearing. To see this effect, consider 
the following two window functions: 

1. A 20 % cosine tapered window (at each side, 10 % of the data record is tapered): 

$$w_C(t) = \begin{cases} 1 & |t| < 4T/10 \\ \cos^2\dfrac{5\pi t}{T} & -T/2 \le t \le -4T/10,\ 4T/10 \le t \le T/2 \\ 0 & |t| > T/2 \end{cases} \tag{4.112}$$

2. A Hann (Hanning) window (full cosine tapered window): 

$$w_H(t) = \begin{cases} \cos^2\dfrac{\pi t}{T} & |t| < T/2 \\ 0 & |t| > T/2 \end{cases} \tag{4.113}$$


These window functions are sometimes called the Tukey window, and are shown in Figure 4.54. 
Note that the cosine tapered window has a narrower bandwidth and so better frequency 
resolution whilst the Hann window has smaller side lobes and sharper roll-off, giving improved 
leakage suppression. 

Figure 4.54 Effect of tapering window 

Window ‘carpentry’ is used to design windows to reduce leakage at the expense of 
main lobe width in Fourier transform calculations, i.e. to obtain windows with small side 
lobes. One ‘trades’ the side lobe reduction for ‘bandwidth’, i.e. by tapering the window 
smoothly to zero, the side lobes are greatly reduced, but the price paid is a much wider 
main lobe. The frequency characteristic of a window is often presented in dB normalized 
to unity gain (0 dB) at zero frequency, e.g. as shown in Figure 4.55 for the rectangular 
window (in general, $A = 1$). 

[Figure 4.55, panel (a): rectangular window $w(t) = A[u(t + T/2) - u(t - T/2)]$, of height $A$ between $-T/2$ and $T/2$] 

The rectangular window may be good for separating closely spaced sinusoidal 
components, but the leakage is the price to pay. Some other commonly used windows and their 
spectral properties (for $f \ge 0$ only) are shown in Figure 4.56. The Hann window is a good 
general purpose window, and has a moderate frequency resolution and a good side lobe 
roll-off characteristic. Through MATLAB Examples 4.6-4.9, the frequency characteristics of the 
rectangular window and the Hann window are compared. Another widely used window is the 
Hamming window (a Hann window sitting on a small rectangular base). It has a low level of 
the first few side lobes, and is used for speech signal processing. The frequency characteristics 
of these window functions are compared in Figure 4.57. 

Figure 4.56 Some commonly used windows, each shown as $w(t)$ and $W(f)$: (a) Bartlett window 
(in general, $A = 1$); (b) Hann window (in general, $A = 1$); (c) Hamming window, 
$w(t) = 0.54 + 0.46\cos(2\pi t/T)$ for $|t| < T/2$ and 0 otherwise; (d) Parzen window 

Figure 4.57 Frequency characteristics of some windows 

We now note a few general comments on windows: 

1. The ability to pick out peaks (resolvability) depends on the data window width as well as 
the shape. 

2. The windows in Figure 4.56 (and others except the rectangular window) are not generally 
applicable to transient waveforms where a significant portion of the information is lost by 
windowing (M4.9). (The exponential window is sometimes used for exponentially decaying 
signals such as responses to impact hammer tests.) 

3. A correction factor (scaling factor) should be applied to the window functions to account 
for the loss of ‘energy’ relative to a rectangular window as follows: 

$$\text{Scaling factor} = \sqrt{\frac{\int_{-T/2}^{T/2} w_{rect}^2(t)\,dt}{\int_{-T/2}^{T/2} w^2(t)\,dt}} \tag{4.114}$$

where $w_{rect}(t)$ is the rectangular window, and $w(t)$ is the window function applied on the 
signal. For example, the scaling factor for the Hann window is $\sqrt{8/3}$. This correction 
factor is used in MATLAB Examples 4.7-4.9. This correction is more readily interpreted 
in relation to stationary random signals and will be commented upon again in that context 
with a more general formula for the estimation of the power spectral density. 

4. For the data windows, we define two ‘bandwidths’ of the windows, namely (a) 3 dB 
bandwidth; (b) noise bandwidth. The 3 dB bandwidth is the width of the power transmission 
characteristic at the 3 dB points, i.e. the points 3 dB below the peak amplification, 
as shown in Figure 4.58. 

The (equivalent) noise bandwidth is the width of an ideal filter with the same peak 
power gain that accumulates the same power from a white noise source, as shown in 
Figure 4.59 (Harris, 1978). 

5. The properties of some commonly used windows are summarised in Table 4.2. More 
comprehensive discussions on window functions can be found in Harris (1978). 



Figure 4.58 The 3 dB bandwidth 

Figure 4.59 Noise bandwidth 


Table 4.2 Properties of some window functions 

Window (length T)                      Highest side   Asymptotic roll-off   3 dB        Noise       First zero
                                       lobe (dB)      (dB/octave)           bandwidth   bandwidth   crossing (freq.)

Rectangular                            -13.3          6                     0.89/T      1.00/T      1/T
Bartlett (triangle)                    -26.5          12                    1.28/T      1.33/T      2/T
Hann(ing) (Tukey or cosine squared)    -31.5          18                    1.44/T      1.50/T      2/T
Hamming                                -43            6                     1.30/T      1.36/T      2/T
Parzen                                 -53            24                    1.82/T      1.92/T      4/T

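The scaling factor of Equation (4.114) and the noise bandwidth can both be computed from a sampled window. The sketch below is an illustration under stated assumptions: the noise bandwidth is evaluated as $\int w^2(t)\,dt / (\int w(t)\,dt)^2$ (the time domain equivalent of the definition above, via Parseval's theorem), the Bartlett window is taken as the triangle $1 - 2|t|/T$, and the integrals are approximated by midpoint sums with an arbitrary number of points.

```matlab
% Sketch: scaling factor (4.114) and noise bandwidth of some windows.
T = 1;  N = 100000;  dt = T/N;
t = ((0:N-1) + 0.5)*dt - T/2;             % midpoints of N cells covering -T/2..T/2

names = {'Rectangular', 'Bartlett', 'Hann', 'Hamming'};
wins  = {ones(1, N), ...                  % rectangular window
         1 - 2*abs(t)/T, ...              % Bartlett (assumed triangle form)
         cos(pi*t/T).^2, ...              % Hann, Equation (4.113)
         0.54 + 0.46*cos(2*pi*t/T)};      % Hamming

sf  = zeros(1, numel(wins));              % scaling factors
nbw = zeros(1, numel(wins));              % noise bandwidths in units of 1/T
for m = 1:numel(wins)
    w = wins{m};
    sf(m)  = sqrt(sum(wins{1}.^2) / sum(w.^2));      % Equation (4.114)
    nbw(m) = T * (sum(w.^2)*dt) / (sum(w)*dt)^2;     % noise bandwidth times T
    fprintf('%-12s scaling factor %.4f, noise bandwidth %.2f/T\n', ...
            names{m}, sf(m), nbw(m));
end
```

The Hann scaling factor should agree with $\sqrt{8/3}$, and the noise bandwidths should reproduce the corresponding column of Table 4.2.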


The Uncertainty Principle (Bandwidth-Time Product) 

As can be seen from the Fourier transform of a rectangular pulse (see Figure 4.8), i.e. Equation 
(4.27), $X(f) = 2ab\sin(2\pi fb)/2\pi fb$, a property of the Fourier transform of a signal is that the 
narrower the signal description in one domain, the wider its description in the other. An extreme 
example is a delta function $\delta(t)$ whose Fourier transform is a constant. Another example is 
a sinusoidal function $\cos(2\pi f_0 t)$ whose Fourier transform is $\frac{1}{2}[\delta(f - f_0) + \delta(f + f_0)]$. This 
fundamental property of signals is generalized by the so-called uncertainty principle. 

Similar to Heisenberg’s uncertainty principle in quantum mechanics, the uncertainty 
principle in Fourier analysis is that the product of the spectral bandwidth and the time duration 
of a signal must be greater than a certain value. Consider a signal $x(t)$ with finite energy, such 
that $\|x\|^2 = \int_{-\infty}^{\infty} x^2(t)\,dt < \infty$, and its Fourier transform $X(\omega)$. We define the following: 


$$\bar{t} = \frac{1}{\|x\|^2}\int_{-\infty}^{\infty} t\,x^2(t)\,dt \tag{4.115a}$$

$$(\Delta t)^2 = \frac{1}{\|x\|^2}\int_{-\infty}^{\infty} (t - \bar{t})^2 x^2(t)\,dt \tag{4.115b}$$

where $\bar{t}$ is the centre of gravity of the area defined by $x^2(t)$, i.e. the measure of location, and 
the time dispersion $\Delta t$ is the measure of the spread of $x(t)$. Similarly, on the frequency scale, 
$\|X\|^2 = \int_{-\infty}^{\infty} |X(\omega)|^2\,d\omega$, and we define 

$$\bar{\omega} = \frac{1}{\|X\|^2}\int_{-\infty}^{\infty} \omega\,|X(\omega)|^2\,d\omega \tag{4.116a}$$

$$(\Delta\omega)^2 = \frac{1}{\|X\|^2}\int_{-\infty}^{\infty} (\omega - \bar{\omega})^2\,|X(\omega)|^2\,d\omega \tag{4.116b}$$


where $\bar{\omega}$ is the measure of location on the frequency scale, and $\Delta\omega$ is called the spectral 
bandwidth, which is the measure of spread of $X(\omega)$. Note that for a real signal $x(t)$, $\bar{\omega}$ is equal 
to zero since $|X(\omega)|^2$ is even. Using Schwarz’s inequality 

$$\int_{-\infty}^{\infty} |f(t)|^2\,dt \int_{-\infty}^{\infty} |g(t)|^2\,dt \ge \left|\int_{-\infty}^{\infty} f(t)g(t)\,dt\right|^2 \tag{4.117}$$

and Parseval’s theorem, it can be shown that (Hsu, 1970) 

$$\Delta\omega \cdot \Delta t \ge \frac{1}{2} \tag{4.118}$$

or, if the spectral bandwidth is defined in hertz, 

$$\Delta f \cdot \Delta t \ge \frac{1}{4\pi} \tag{4.119}$$

Thus, the bandwidth-time (BT) product of a signal has a lower bound of 1/2. For 
example, the BT product of the rectangular window is $\Delta\omega \cdot \Delta t = 2\pi$ (or $\Delta f \cdot \Delta t = 1$), 
and the Gaussian pulse $e^{-at^2}$ has the ‘minimum BT product’ of $\Delta\omega \cdot \Delta t = 1/2$ (recall 
that the Fourier transform of a Gaussian pulse is another Gaussian pulse, see Equation 
(4.33)). For the proof of these results, see Hsu (1970). 
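The minimum BT product of the Gaussian pulse can be checked by evaluating Equations (4.115) and (4.116) numerically. In the sketch below the value of `a` and the two grids are assumptions, and $X(\omega)$ is obtained by direct numerical integration of the Fourier integral rather than from Equation (4.33).

```matlab
% Sketch: bandwidth-time product of the Gaussian pulse exp(-a t^2).
a = 2;                                         % assumed pulse parameter
t = linspace(-6, 6, 1201);    dt = t(2) - t(1);   % assumed time grid
w = linspace(-20, 20, 1601);  dw = w(2) - w(1);   % assumed frequency grid (rad/s)

x = exp(-a*t.^2);
X = (x * exp(-1j * (t.' * w))) * dt;           % X(omega) by numerical integration

% Time location and dispersion, Equations (4.115a) and (4.115b)
Ex   = sum(x.^2) * dt;
tbar = sum(t .* x.^2) * dt / Ex;
Dt   = sqrt(sum((t - tbar).^2 .* x.^2) * dt / Ex);

% Frequency location and spectral bandwidth, Equations (4.116a) and (4.116b)
EX   = sum(abs(X).^2) * dw;
wbar = sum(w .* abs(X).^2) * dw / EX;
Dw   = sqrt(sum((w - wbar).^2 .* abs(X).^2) * dw / EX);

fprintf('Delta_t = %.4f, Delta_omega = %.4f, product = %.4f\n', Dt, Dw, Dt*Dw);
```

The product should come out very close to 1/2 whatever positive value of `a` is chosen, since a narrower pulse has a correspondingly wider spectrum.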


The inequality above points out a difficulty (or a limitation) in the Fourier-based time- 
frequency analysis methods. That is, if we want to obtain a ‘local’ Fourier transform then 
increasing the ‘localization’ in the time domain results in poorer resolution in the frequency 
domain, and vice versa. In other words, we cannot achieve arbitrarily fine ‘resolution’ in both 
the time and frequency domains at the same time. 

Sometimes, the concept of the above inverse spreading property can be very useful to 
understand principles of noise control. For example, when the impact between two solid 
bodies produces a significant noise, the most immediate remedy may be to increase the impact 
duration by adding some resilient material. This increase of time results in narrower frequency 
bandwidth, i.e. removes the high-frequency noise, and reduces the total noise level. This is 
illustrated in Figure 4.60 assuming that the force is a half-sine pulse. Note that the impulse 
(the area under the force curve, $x_i(t)$) is the same for both cases, i.e. 

$$\int_0^{T_1} x_1(t)\,dt = \int_0^{T_2} x_2(t)\,dt$$

However, the total energy of the second impulse is much smaller, i.e. 

$$\int_{-\infty}^{\infty} |X_1(f)|^2\,df \gg \int_{-\infty}^{\infty} |X_2(f)|^2\,df$$

as shown in Figure 4.60(b). Also note that, for each case, Parseval’s theorem is satisfied, i.e. 

$$\int_0^{T_i} x_i^2(t)\,dt = \int_{-\infty}^{\infty} |X_i(f)|^2\,df$$

Figure 4.60 Interpretation of impact noise 

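The effect of lengthening the impact can be quantified for two half-sine force pulses with the same impulse. The sketch below uses assumed values for the impulse `I` and the contact durations `T1` and `T2`; the energies are computed in the time domain, which by Parseval's theorem equals the integral of $|X_i(f)|^2$.

```matlab
% Sketch: two half-sine force pulses with equal impulse but different duration.
I  = 1;                                   % assumed impulse (area under the force curve)
T1 = 1e-3;  T2 = 5e-3;                    % assumed contact durations (s), T2 > T1
dt = 1e-6;  t = 0:dt:T2;

% A half-sine F*sin(pi t/T) has area 2*F*T/pi, so F = pi*I/(2*T) gives impulse I
F1 = pi*I/(2*T1);
F2 = pi*I/(2*T2);
x1 = F1 * sin(pi*t/T1) .* (t <= T1);      % short, hard impact
x2 = F2 * sin(pi*t/T2);                   % longer, softer impact

imp1 = trapz(t, x1);     imp2 = trapz(t, x2);      % impulses (should be equal)
E1   = trapz(t, x1.^2);  E2   = trapz(t, x2.^2);   % total energies

fprintf('Impulses: %.4f and %.4f\n', imp1, imp2);
fprintf('Energy ratio E1/E2 = %.3f (T2/T1 = %.3f)\n', E1/E2, T2/T1);
```

For half-sine pulses of equal impulse the energy is inversely proportional to the duration, so the energy ratio should equal `T2/T1`.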


4.12 BRIEF SUMMARY 

1. A deterministic aperiodic signal may be expressed by 

$$x(t) = \int_{-\infty}^{\infty} X(f)e^{j2\pi ft}\,df \quad \text{and} \quad X(f) = \int_{-\infty}^{\infty} x(t)e^{-j2\pi ft}\,dt \quad \text{: Fourier integral pair}$$

2. Then, the energy spectral density of $x(t)$ is $|X(f)|^2$ and satisfies 

$$\int_{-\infty}^{\infty} x^2(t)\,dt = \int_{-\infty}^{\infty} |X(f)|^2\,df \quad \text{: Parseval's theorem}$$

3. The input-output relationship for an LTI system is expressed by the convolution 
integral, 

$x(t)$ → [LTI system, $h(t)$] → $y(t)$ 

i.e. $y(t) = h(t) * x(t) = \int_{-\infty}^{\infty} h(\tau)x(t - \tau)\,d\tau$, and in the frequency domain $Y(f) = H(f)X(f)$. 

4. A pure delay preserves the shape of the original signal, and gives a constant value 
of group delay $-d\phi/d\omega = t_0$. A non-constant group delay indicates the degree of 
nonlinearity of the phase. 

5. A minimum phase system has all its poles and zeros in the left half of the s-plane, and 
is especially useful for inverse problems. 

6. The analytic signal $a_x(t) = A_x(t)e^{j\phi_x(t)}$ provides the concepts of instantaneous 
amplitude, instantaneous phase and instantaneous frequency. 

7. If a signal is truncated such that $x_T(t) = x(t)w(t)$, then 
$X_T(f) = \int_{-\infty}^{\infty} X(g)W(f - g)\,dg$. 

8. Data windows $w(t)$ introduce ‘leakage’ and distort the Fourier transform. Both the 
width and shape of the window dictate the resolvability of closely spaced frequency 
components. A ‘scale factor’ should be employed when a window is used. 

9. The uncertainty principle states that the product of the spectral bandwidth and the time 
extent of a signal is $\Delta\omega \cdot \Delta t \ge 1/2$. This indicates the fundamental limitations of the 
Fourier-based analyses. 


4.13 MATLAB EXAMPLES 


Example 4.1: The effect of an echo 

Consider a signal with a pure echo, $y(t) = x(t) + ax(t - t_0)$ as given in Equation (4.46), 
where the main signal is $x(t) = e^{-\lambda|t|}$ (see Equation (4.20) and Figure 4.5). For this 
example, the parameters $a = 0.2$, $\lambda = 300$ and $t_0 = 0.15$ are chosen. Readers may change 
these values to examine the effects for various cases. 

MATLAB code, with the comments given on the lines beginning with %: 

clear all 
% Define the time variable from -5 to 5 seconds with sampling rate fs = 500. 
fs=500; t=-5:1/fs:5; 
% Assign values for the parameters of the signal. 
lambda=300; t0=0.15; a=0.2; 
% Expression of the main signal, x(t). This is for the comparison with y(t). 
x=exp(-lambda*abs(t)); 
% Expression of the signal, y(t). 
y=x+a*exp(-lambda*abs(t-t0)); 
% Fourier transforms of signals x(t) and y(t). In fact, this is the discrete 
% Fourier transform (DFT) which will be discussed in Chapter 6. 
X=fft(x); Y=fft(y); 
% Define the frequency variables for both positive and negative frequencies. 
% (The frequency spacing of the DFT will also be discussed in Chapter 6.) 
% The command 'fliplr' flips the vector (or matrix) in the left/right direction. 
N=length(x); 
fp=0:fs/N:fs/2;          % for the positive frequency 
fn=-fs/N:-fs/N:-fs/2;    % for the negative frequency 
f=[fliplr(fn) fp]; 
% Plot the magnitude of X(f), i.e. |X(f)| (dashed line; see footnote 2), and hold 
% the graph. The command 'fftshift' shifts the zero frequency component to the 
% middle of the spectrum. Note that the magnitude is scaled by '1/fs', and the 
% reason for doing this will also be found in Chapter 6. 
plot(f,fftshift(abs(X)/fs), 'r:') 
xlabel('Frequency (Hz)'); ylabel('Modulus') 
hold on 
% Plot the magnitude |Y(f)| on the same graph, and release the graph. 
% Compare this with |X(f)|. 
plot(f,fftshift(abs(Y)/fs)) 
hold off 

Results 



Example 4.2: Appearances of envelope and carrier signals 

This is examined for the cases of $t_p = t_g$, $t_p < t_g$ and $t_p > t_g$ in Equation (4.75), i.e. 

$$x(t) = \underbrace{2AB\,\frac{\sin(B(t - t_g))}{\pi B(t - t_g)}}_{\text{envelope}}\ \underbrace{\cos\omega_k(t - t_p)}_{\text{carrier}}$$

Footnote 2: It is dotted line in the MATLAB code. However, dashed lines are used for generating figures. So, the dashed line in 
the comments denotes the ‘dotted line’ in the corresponding MATLAB code. This applies to all MATLAB examples 
in this book. 

MATLAB code, with the comments given on the lines beginning with %: 

clear all 
% Define the frequency band in rad/s. 
B=1; 
% Select the amplitude A arbitrarily, and define the carrier frequency wk 
% such that wk >> B. 
A=3; 
wk=6; 
% Define the group delay tg, and the phase delay tp. In this example, we use 
% tp=5 for tp = tg, tp=4.7 for tp < tg, and tp=5.3 for tp > tg. Try with 
% different values. 
tg=5; 
tp=5;   % tp=4.7 (for tp < tg), tp=5.3 (for tp > tg) 
% Define the time variable. 
t=0:0.03:10; 
% Expression of the above equation. This is the actual time signal. 
x=2*A*B*sin(B*(t-tg))./(pi*B*(t-tg)).*cos(wk*(t-tp)); 
% Expression of the 'envelope' signal. 
xe=2*A*B*sin(B*(t-tg))./(pi*B*(t-tg)); 
% Plot the actual amplitude-modulated signal, and hold the graph. 
plot(t,x); xlabel('Time (s)'); ylabel('\itx\rm(\itt\rm)') 
hold on 
% Plot the envelope signal with the dashed line, and release the graph. 
plot(t, xe, 'g:', t, -xe, 'g:') 
hold off 
grid on 
Results 



Time (s) 


(a) 



(b) (c) 
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Example 4.3: Hilbert transform: amplitude-modulated signal (see Equation (4.103)) 
x(t) = (A c + A,„ sina>,„t) cos co c t = (A t . + A,„ sin27r f m t) cos 2nf c t 


For this example, the parameters A c = 1, A m = 0.5, /,„ = 1 and f c = 10 are chosen. 


Line  MATLAB code                                           Comments

1     clear all                                             Define parameters and the time
2     Ac=1; Am=0.5; fm=1; fc=10;                            variable.
3     t=0:0.001:3;
4     x=(Ac+Am*sin(2*pi*fm*t)).*cos(2*pi*fc*t);             Expression of the amplitude-modulated
                                                            signal, x(t).
5     a=hilbert(x);                                         Create the analytic signal. Note that, in
                                                            MATLAB, the function 'hilbert' creates the
                                                            analytic signal, not the Hilbert transform
                                                            $\hat{x}(t)$.
6     fx=diff(unwrap(angle(a)))./diff(t)/(2*pi);            This is an approximate derivative,
                                                            which computes the instantaneous
                                                            frequency in Hz.
7     figure(1)                                             Plot the instantaneous amplitude
8     plot(t, abs(a), t, x, 'g:')                           $A_x(t)$.
9     axis([0 3 -2 2])                                      Note that $A_x(t)$ estimates well the
10    xlabel('Time (s)'); ylabel('\itA_x\rm(\itt\rm)')      envelope of the signal,
                                                            $A_c + A_m \sin 2\pi f_m t = 1 + 0.5\sin 2\pi \cdot 1 \cdot t$.
11    figure(2)                                             Plot the instantaneous (unwrapped)
12    plot(t, unwrap(angle(a)))                             phase $\phi_x(t)$, which increases linearly
13    axis([0 3 0 200])                                     with time.
14    xlabel('Time (s)'); ylabel('\it\phi_x\rm(\itt\rm)')
15    figure(3)                                             Plot the instantaneous frequency,
16    plot(t(2:end),fx)                                     where $f_x(t) = \omega_x(t)/2\pi$.
17    axis([0 3 8 12])                                      Note that $f_x(t)$ estimates $f_c = 10$
18    xlabel('Time (s)'); ylabel('\itf_x\rm(\itt\rm)')      reasonably well, except small regions
                                                            at the beginning and end.


Results

[Figure: panels (a), (b) and (c)]


Example 4.4: Hilbert transform: estimation of damping coefficient (see Equation 
(4.106)) 

Suppose we have a signal represented as Equation (4.105), i.e.

$$x(t) = Ae^{-\zeta\omega_n t}\sin(\omega_d t + \phi) = Ae^{-\zeta 2\pi f_n t}\sin(\omega_d t + \phi)$$

and, for this example, the parameters $A = 1$, $\zeta = 0.01$, $f_n = 10$ and $\phi = 0$ are chosen.


Line  MATLAB code                                           Comments

1     clear all                                             Define parameters and the time variable.
2     A=1; zeta=0.01; fn=10; wn=2*pi*fn;
3     wd=wn*sqrt(1-zeta^2); phi=0; t=0:0.001:6;
4     x=A*exp(-zeta*wn*t).*sin(wd*t+phi);                   Expression of the signal (Equation
                                                            (4.105)).
5     a=hilbert(x);                                         Create the analytic signal.
6     ax=log(abs(a));                                       Compute $\ln A_x(t)$. Note that 'log' in
                                                            MATLAB denotes the natural logarithm.
7     figure(1)                                             Plot the instantaneous amplitude $A_x(t)$.
8     plot(t, abs(a), t, x, 'g:'); axis([0 6 -1.5 1.5])     Note that, in this figure (Figure (a) below),
9     xlabel('Time (s)');                                   the windowing effect (truncation in the
      ylabel('\itA_x\rm(\itt\rm)')                          frequency domain; MATLAB uses the
                                                            FFT-based algorithm, see MATLAB help
                                                            window for details) and the non-causal
                                                            component are clearly visible.
10    figure(2)                                             Plot $\ln A_x(t)$ versus time. The figure shows
11    plot(t, ax); axis([0 6 -6 1])                         a linearly decaying characteristic over the
12    xlabel('Time (s)');                                   range where the windowing effects are not
      ylabel('ln\itA_x\rm(\itt\rm)')                        significant.
13    p=polyfit(t(1000:4000), ax(1000:4000), 1);            'polyfit' finds the coefficients of a
                                                            polynomial that fits the data in the least
                                                            squares sense. In this example, we use a
                                                            polynomial of degree 1 (i.e. linear
                                                            regression). Also, we use the data set in the
                                                            well-defined region only (i.e. 1 to
                                                            4 seconds).
14    format long                                           'format long' displays the number with 15
                                                            digits.
15    zeta_est=-p(1)/wn                                     The first element of the vector p represents
                                                            the slope of the graph in Figure (b) below.
                                                            Thus, $\zeta$ can be estimated by dividing
                                                            -p(1) by the natural frequency $\omega_n$.


Results 



The variable 'zeta_est' returns the value '0.00999984523039', which is very close to the
true value $\zeta = 0.01$.
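
The same estimate can be sketched with base functions only. The listing below is a minimal sketch, assuming that the analytic signal may be formed by the usual FFT recipe (keep the DC bin, double the positive-frequency bins, zero the negative-frequency bins); the weighting vector h is our own illustrative construction and is not taken from the example above.

```matlab
% Sketch: damping estimate from the log-envelope, using base functions only.
A = 1; zeta = 0.01; fn = 10; wn = 2*pi*fn;
wd = wn*sqrt(1 - zeta^2);
t = 0:0.001:6;
x = A*exp(-zeta*wn*t).*sin(wd*t);

% FFT-based analytic signal (assumed recipe, see the lead-in).
N = length(x);
h = zeros(1, N);
h(1) = 1;                            % DC bin kept once
if mod(N, 2) == 0
    h(2:N/2) = 2;  h(N/2 + 1) = 1;   % even N: Nyquist bin kept once
else
    h(2:(N + 1)/2) = 2;              % odd N: no Nyquist bin
end
a = ifft(fft(x).*h);                 % analytic signal

ax = log(abs(a));                    % ln of the instantaneous amplitude
idx = 1000:4000;                     % well-defined region, about 1 to 4 s
p = polyfit(t(idx), ax(idx), 1);     % linear regression of ln A_x(t)
zeta_est = -p(1)/wn;                 % slope of the fit is -zeta*wn
```

The fit is restricted to the central part of the record for the same reason as in Line 13 above: the end regions are distorted by the windowing effect.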


Example 4.5: Hilbert transform: frequency-modulated signal (see Equation (4.107))

$$x(t) = A_c \cos(\omega_c t + A_m \sin\omega_m t) = A_c \cos(2\pi f_c t + A_m \sin 2\pi f_m t)$$

For this example, the parameters $A_c = 1$, $A_m = 4$, $f_m = 1$ and $f_c = 8$ are chosen.


Line  MATLAB code                                           Comments

1     clear all                                             Note that we define a much finer time
2     Ac=1; Am=4; fm=1; fc=8;                               variable for a better approximation of
3     t=0:0.0001:4;                                         the derivative (see Line 6 of the
                                                            MATLAB code).
4     x=Ac*cos(2*pi*fc*t + Am*sin(2*pi*fm*t));              Expression of the
                                                            frequency-modulated signal, x(t).
5     a=hilbert(x);                                         Create the analytic signal.
6     fx=diff(unwrap(angle(a)))./diff(t)/(2*pi);            Compute the instantaneous frequency
                                                            in Hz.
7     figure(1)                                             Plot the instantaneous amplitude
8     plot(t, abs(a), t, x, 'g:'); axis([0 4 -1.5 1.5])     $A_x(t)$.
9     xlabel('Time (s)'); ylabel('\itA_x\rm(\itt\rm)')      Note that the envelope is
                                                            $A_x(t) \approx A_c = 1$.
10    figure(2)                                             Plot the instantaneous (unwrapped)
11    plot(t, unwrap(angle(a))); axis([0 4 0 220])          phase $\phi_x(t)$.
12    xlabel('Time (s)');
      ylabel('\it\phi_x\rm(\itt\rm)')
13    figure(3)                                             Plot the instantaneous frequency,
14    plot(t(2:end),fx); axis([0 4 0 13])                   where $f_x(t) = \omega_x(t)/2\pi$.
15    xlabel('Time (s)'); ylabel('\itf_x\rm(\itt\rm)')      Note that $f_x(t) = f_c + f_m A_m \cos 2\pi f_m t$
                                                            $= 8 + 4\cos 2\pi \cdot 1 \cdot t$.


Results 
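
The quality of the finite-difference derivative used in Line 6 can be checked independently of the Hilbert transform. The following is a minimal sketch, assuming we differentiate the exact phase of x(t) rather than the phase of the analytic signal; the variable names phi, tm and err are our own.

```matlab
% Sketch: finite-difference instantaneous frequency of the exact FM phase.
Ac = 1; Am = 4; fm = 1; fc = 8;
t = 0:0.0001:4;
phi = 2*pi*fc*t + Am*sin(2*pi*fm*t);     % exact phase of x(t)
fx = diff(phi)./diff(t)/(2*pi);          % approximate derivative, in Hz
tm = (t(1:end-1) + t(2:end))/2;          % midpoints, where diff is centred
fx_true = fc + fm*Am*cos(2*pi*fm*tm);    % f_c + f_m*A_m*cos(2*pi*f_m*t)
err = max(abs(fx - fx_true));            % worst-case difference in Hz
```

With the fine time step used in this example the difference is very small, so any visible departure from 8 + 4 cos(2πt) in the Hilbert-based result comes from the analytic-signal estimate rather than from the derivative.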




Example 4.6: Effects of windowing on the modulus of the Fourier transform 

Case 1 : Rectangular window (data truncation) 

Consider the following signal with three sinusoidal components: 

$$x(t) = A_1 \sin 2\pi f_1 t + A_2 \sin 2\pi f_2 t + A_3 \sin 2\pi f_3 t$$

Amplitudes are $A_1 = A_2 = A_3 = 2$, which gives the magnitude '1' for each sinusoidal
component in the frequency domain. The frequencies are chosen as $f_1 = 10$, $f_2 = 20$
and $f_3 = 21$.


Line  MATLAB code                                     Comments

1     clear all
2     f1=10; f2=20; f3=21; fs=60;                     Define frequencies. The sampling rate is chosen as 60 Hz.
3     T=0.6; % try different values: 0.6, 0.8, 1.0,   Define the window length 0.6 s. In this example, we use
      % 1.5, 2, 2.5, 3, 4                             various lengths to demonstrate the effect of windowing.
4     t=0:1/fs:T-1/fs;                                Define time variable from 0 to T-1/fs seconds. The
                                                      subtraction by 1/fs is introduced in order to make 'exact'
                                                      periods of the sinusoids (see Chapter 6 for more details
                                                      of DFT properties).
5     x=2*sin(2*pi*f1*t) + 2*sin(2*pi*f2*t) ...       Description of the above equation.
        + 2*sin(2*pi*f3*t);
6     N=length(x);                                    Perform DFT using the 'fft' function of MATLAB.
7     X=fft(x);                                       Calculate the frequency variable (see Chapter 6).
8     f=fs*(0:N-1)/N;
9     Xz=fft([x zeros(1,2000-N)]); % zero padding     Perform '2000-point' DFT by adding zeros at the end of
10    Nz=length(Xz);                                  the time sequence 'x'. This procedure is called 'zero
11    fz=fs*(0:Nz-1)/Nz;                              padding' (see the comments below). Calculate new
                                                      frequency variable accordingly.
12    figure(1)                                       Plot the modulus of the DFT (from 0 to fs/2 Hz). Note
13    stem(f(1:N/2+1), abs(X(1:N/2+1)/fs/T), 'r:')    that the DFT coefficients are divided by the sampling
14    axis([0 30 0 1.2])                              rate fs in order to make its amplitude the same as the
15    xlabel('Frequency (Hz)'); ylabel('Modulus')     Fourier integral (see Chapter 6). Also note that, since
                                                      the time signal is periodic, it is further divided by 'T'
                                                      in order to compensate for its amplitude, and to make it
                                                      the same as the Fourier series coefficients (see
                                                      Chapter 6 and Chapter 3, Equation (3.45)).
16    hold on                                         The DFT without zero padding is drawn as the dashed
17    plot(fz(1:Nz/2+1), abs(Xz(1:Nz/2+1)/fs/T))      stem lines with circles, and the DFT with zero padding
18    hold off; grid on                               is drawn as a solid line. Two graphs are drawn in the
                                                      same figure.


Comments: 

1. Windowing with the rectangular window is just the truncation of the signal (i.e. from 
0 to T seconds). The results are shown next together with MATLAB Example 4.7. 

2. Zero padding: Padding 'zeros' at the end of the time sequence improves the appearance
in the frequency domain since the spacing between frequencies is reduced. In
other words, zero padding in the time domain results in interpolation in the frequency
domain (Smith, 2003). Sometimes this procedure is called 'spectral interpolation'.
As a result, the appearance in the frequency domain (DFT) resembles the true spectrum
(Fourier integral), thus it is useful for demonstration purposes. However, it does
not increase the 'true' resolution, i.e. does not improve the ability to distinguish
closely spaced frequencies. Note that the actual resolvability in the frequency domain
depends on the data length T and the window type. Another reason for zero
padding is to make the length of the sequence a power of two to suit the FFT algorithm.
However, this is no longer necessary in many cases such as programming in
MATLAB.

Since zero padding may give a wrong impression of the results, it is not used in this 
book except for some demonstration and special purposes. 
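
The interpolation property of zero padding can be checked numerically. The following is a minimal sketch, assuming a padding factor L = 4 and a record length T = 1 s (both chosen only for illustration): every L-th value of the zero-padded DFT coincides with the original DFT, so the padded spectrum contains no new information.

```matlab
% Sketch: zero padding interpolates the DFT; it adds no new information.
f1 = 10; f2 = 20; f3 = 21; fs = 60; T = 1;
N = round(T*fs);
t = (0:N-1)/fs;
x = 2*sin(2*pi*f1*t) + 2*sin(2*pi*f2*t) + 2*sin(2*pi*f3*t);
L = 4;                              % padding factor (assumed for illustration)
X  = fft(x);                        % N-point DFT
Xz = fft([x zeros(1, (L-1)*N)]);    % L*N-point DFT of the padded sequence
max_diff = max(abs(Xz(1:L:end) - X));   % every L-th bin equals the original DFT
df_orig = fs/N;                     % frequency spacing without padding
df_pad  = fs/(L*N);                 % frequency spacing with padding
```

The frequency spacing falls from fs/N to fs/(LN), but the values between the original frequency lines are determined entirely by the same N data points.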


Example 4.7: Effects of windowing on the modulus of the Fourier transform 

Case 2: Hann window 

In this example, we use the same signal as in the previous example. 


Line  MATLAB code                                               Comments

1     clear all                                                 Same as in the previous example.
2     f1=10; f2=20; f3=21; fs=60;
3     T=0.6;
      % try different values: 0.6, 0.8, 1.0, 1.5, 2, 2.5, 3, 4
4     t=0:1/fs:T-1/fs;
5     x=2*sin(2*pi*f1*t) + 2*sin(2*pi*f2*t) ...
        + 2*sin(2*pi*f3*t);
6     N=length(x);
7     whan=hanning(N);                                          Generate the Hann window with
8     x=x.*whan';                                               the same size of vector as x, and
9     X=fft(x);                                                 multiply by x. Then, perform the
10    f=fs*(0:N-1)/N;                                           DFT of the windowed signal.
11    Xz=fft([x zeros(1,2000-N)]); % zero padding               Same as in the previous example.
12    Nz=length(Xz);
13    fz=fs*(0:Nz-1)/Nz;
14    figure(1)                                                 Same as in the previous example,
15    stem(f(1:N/2+1), sqrt(8/3)*abs(X(1:N/2+1)/fs/T), 'r:')    except that the magnitude
16    axis([0 30 0 1.2])                                        spectrum is multiplied by the
17    xlabel('Frequency (Hz)'); ylabel('Modulus')               scale factor 'sqrt(8/3)' (see
18    hold on                                                   Equation (4.114)).
19    plot(fz(1:Nz/2+1), sqrt(8/3)*abs(Xz(1:Nz/2+1)/fs/T))
20    hold off; grid on

Results of Examples 4.6 and 4.7

[Figure: modulus versus Frequency (Hz); left column: rectangular window, right column: Hann window]

Comments:

1. The 10 Hz component is included as a reference, i.e. for the purpose of comparison 
with the other two peaks. 

2. The solid line (DFT with zero padding) is mainly for demonstration purposes, and 
the dashed stem line with circles is the actual DFT of the windowed sequence. From 
the results of the DFT (without zero padding), it is shown that the two sinusoidal 
components (20 Hz and 21 Hz) are separated after T = 2 for the case of a rectangular 
window. On the other hand, they are not completely separable until T = 4 if the Hann 
window is used. This is because of its wider main lobe. However, we note that the 
leakage is greatly reduced by the Hann window. 

3. For the case of the Hann window, the magnitudes of peaks are underestimated even if 
the scale factor is used. (Note that the main lobe contains more frequency lines than 
in the rectangular window.) 

4. However, for the case of the rectangular window, the peaks are estimated correctly
when the data length corresponds to exact periods of the signal, i.e. when T = 1, 2, 3
and 4. Note that the peak frequencies are located precisely in this case (see the 21 Hz
component). Compare this with the other cases (non-integer T) and with MATLAB
Example 3.2 in Chapter 3.
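
Comment 4 can be illustrated with a short calculation. The listing below is a minimal sketch, assuming the two record lengths T = 1 s (exact periods) and T = 1.5 s (the 21 Hz component does not complete an integer number of periods); it records the DFT modulus at the frequency line closest to 21 Hz in each case.

```matlab
% Sketch: DFT peak values for integer and non-integer record lengths.
f1 = 10; f2 = 20; f3 = 21; fs = 60;
Ts = [1 1.5];                         % exact and non-exact numbers of periods
peaks = zeros(1, 2);
for k = 1:2
    N = round(Ts(k)*fs);
    t = (0:N-1)/fs;
    x = 2*sin(2*pi*f1*t) + 2*sin(2*pi*f2*t) + 2*sin(2*pi*f3*t);
    X = abs(fft(x))/N;                % same scaling as /fs/T, since N = fs*T
    f = fs*(0:N-1)/N;
    [~, i21] = min(abs(f - f3));      % frequency line closest to 21 Hz
    peaks(k) = X(i21);
end
```

For T = 1 the 21 Hz line falls exactly on a DFT frequency and the modulus equals the expected value of 1; for T = 1.5 it falls between two lines and the peak is underestimated.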


Example 4.8: Comparison between the rectangular window and the Hann window: 
side roll-off characteristics 


Consider the signal $x(t) = A_1 \sin(2\pi f_1 t) + A_2 \sin(2\pi f_2 t)$, where $A_1 \gg A_2$. In this
example, we use $A_1 = 1$, $A_2 = 0.001$, $f_1 = 9$, $f_2 = 14$, and the data (window) length 'T =
15.6 seconds'.


Line  MATLAB code                                           Comments

1     clear all                                             Define parameters and the time
2     f1=9; f2=14; fs=50; T=15.6;                           variable. 'T=15.6' is chosen to
3     t=0:1/fs:T-1/fs;                                      introduce some windowing effect.
                                                            The sampling rate is chosen as 50 Hz.
4     x=1*sin(2*pi*f1*t) + 0.001*sin(2*pi*f2*t);            Expression of the above equation.
5     N=length(x);                                          Create the Hann windowed signal xh,
6     whan=hanning(N); xh=x.*whan';                         and then perform the DFT of both x
7     X=fft(x); Xh=fft(xh);                                 and xh. Also, calculate the frequency
8     f=fs*(0:N-1)/N;                                       variable.
9     figure(1)                                             Plot the results: solid line for the
10    plot(f(1:N/2+1), 20*log10(abs(X(1:N/2+1)/fs/T)));     rectangular window, and the dashed
      hold on                                               line for the Hann window.
11    plot(f(1:N/2+1), 20*log10(sqrt(8/3)* ...
        abs(Xh(1:N/2+1)/fs/T)), 'r:')
12    axis([0 25 -180 0])
13    xlabel('Frequency (Hz)'); ylabel('Modulus (dB)')
14    hold off

Results



Comments: The second frequency component is hardly noticeable with the rectangular 
window owing to the windowing effect. But, using the Hann window, it becomes pos- 
sible to see even a very small amplitude component, due to its good side lobe roll-off 
characteristic. 
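
The comment above can be made quantitative. The following is a minimal sketch, assuming the Hann window is written out explicitly as 0.5(1 - cos(2πn/(N+1))), n = 1, ..., N, which we take to match the toolbox function 'hanning'; the band limits used for comparison (13.5 to 14.5 Hz and 11 to 12 Hz) are our own choice.

```matlab
% Sketch: visibility of a weak 14 Hz component next to a strong 9 Hz one.
f1 = 9; f2 = 14; fs = 50; T = 15.6;
N = round(T*fs);
t = (0:N-1)/fs;
x = sin(2*pi*f1*t) + 0.001*sin(2*pi*f2*t);
whan = 0.5*(1 - cos(2*pi*(1:N)/(N + 1)));   % explicit Hann window (assumed form)
X  = abs(fft(x))/N;                          % rectangular window
Xh = sqrt(8/3)*abs(fft(x.*whan))/N;          % Hann window with scale factor
f = fs*(0:N-1)/N;
near = (f > 13.5) & (f < 14.5);              % band around the weak component
ref  = (f > 11)   & (f < 12);                % band containing leakage only
ratio_rect = max(X(near))/max(X(ref));       % about 1 or less: component hidden
ratio_hann = max(Xh(near))/max(Xh(ref));     % much larger than 1: component visible
```

With the rectangular window the level near 14 Hz is no higher than the leakage level at neighbouring frequencies, whereas with the Hann window the 14 Hz component stands well clear of it.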


Example 4.9: Comparison between the rectangular window and the Hann window 
for a transient signal 

Case 1: Response of a single-degree-of-freedom system 
Consider the free response of a single-degree-of-freedom system 

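$$x(t) = \frac{A}{\omega_d}e^{-\zeta\omega_n t}\sin(\omega_d t) \quad\text{and}\quad F\{x(t)\} = \frac{A}{\omega_n^2 - \omega^2 + j2\zeta\omega_n\omega}$$

where $\omega_d = \omega_n\sqrt{1 - \zeta^2}$. In this example, we use $A = 200$, $\zeta = 0.01$, $\omega_n = 2\pi f_n = 2\pi(20)$.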


Line  MATLAB code                                               Comments

1     clear all                                                 The sampling rate is chosen as
2     fs=100; t=[0:1/fs:5-1/fs];                                100 Hz. The time variable and
3     A=200; zeta=0.01; wn=2*pi*20;                             other parameters are defined.
      wd=sqrt(1-zeta^2)*wn;
4     x=(A/wd)*exp(-zeta*wn*t).*sin(wd*t);                      Expression of the time signal.
5     N=length(x);                                              Create the Hann windowed
6     whan=hanning(N); xh=x.*whan';                             signal xh, and then perform
7     X=fft(x); Xh=fft(xh);                                     the DFT of both x and xh.
8     f=fs*(0:N-1)/N;                                           Also, calculate the frequency
                                                                variable.
9     H=A./(wn^2 - (2*pi*f).^2 + i*2*zeta*wn*(2*pi*f));         Expression of the true Fourier
                                                                transform, $F\{x(t)\}$.
10    figure(1)                                                 Plot the results in dB scale:
11    plot(f(1:N/2+1), 20*log10(abs(X(1:N/2+1)/fs)));           solid line (upper) for the
      hold on                                                   rectangular window, solid line
12    plot(f(1:N/2+1), 20*log10(sqrt(8/3)* ...                  (lower) for the Hann window,
        abs(Xh(1:N/2+1)/fs)), 'r')                              and dashed line for the 'true'
13    plot(f(1:N/2+1), 20*log10(abs(H(1:N/2+1))), 'g:')         Fourier transform.
14    axis([0 50 -150 0])
15    xlabel('Frequency (Hz)'); ylabel('Modulus (dB)')
16    hold off
17    figure(2)                                                 Plot the results in linear scale:
18    plot(f(1:N/2+1), abs(X(1:N/2+1)/fs)); hold on             underestimation of the
19    plot(f(1:N/2+1), (sqrt(8/3)*abs(Xh(1:N/2+1)/fs)), 'r')    magnitude spectrum by the
20    plot(f(1:N/2+1), abs(H(1:N/2+1)), 'g:')                   Hann window is more clearly
21    axis([0 50 0 0.7])                                        seen.
22    xlabel('Frequency (Hz)');
      ylabel('Modulus (linear scale)')
23    hold off



Results 



Comments: Note that the magnitude spectrum is considerably underestimated if the 
Hann window is used, because a significant amount of energy is lost by windowing. 
Thus, in general, windowing is not applicable to transient signals. 
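
The energy loss mentioned above can be estimated directly in the time domain. The following is a minimal sketch, assuming the same explicit form of the Hann window as the toolbox function 'hanning' (an assumption on our part) and comparing the signal energy before windowing with the energy after windowing and compensation by the factor 8/3.

```matlab
% Sketch: energy retained by a Hann-windowed decaying transient.
fs = 100; t = 0:1/fs:5-1/fs;
A = 200; zeta = 0.01; wn = 2*pi*20; wd = sqrt(1 - zeta^2)*wn;
x = (A/wd)*exp(-zeta*wn*t).*sin(wd*t);
N = length(x);
whan = 0.5*(1 - cos(2*pi*(1:N)/(N + 1)));   % explicit Hann window (assumed form)
xh = x.*whan;
E_rect = sum(x.^2)/fs;                      % approximate energy of x(t)
E_hann = (8/3)*sum(xh.^2)/fs;               % after windowing and 8/3 compensation
ratio = E_hann/E_rect;                      % fraction of energy that survives
```

Because the window is close to zero at the start of the record, where most of the energy of the transient is concentrated, the ratio is well below one even after compensation. The factor 8/3 is appropriate for stationary signals but not for a transient of this kind.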


Case 2: Response of a two-degree-of-freedom system, when the contributions of two 
modes are considerably different. This example is similar to MATLAB Example 4.8. 
Consider the free response of a two-degree-of-freedom system, e.g. 


$$x(t) = \frac{A}{\omega_{d1}}e^{-\zeta_1\omega_{n1}t}\sin(\omega_{d1}t) + \frac{B}{\omega_{d2}}e^{-\zeta_2\omega_{n2}t}\sin(\omega_{d2}t)$$

Then, its Fourier transform is

$$F\{x(t)\} = \frac{A}{\omega_{n1}^2 - \omega^2 + j2\zeta_1\omega_{n1}\omega} + \frac{B}{\omega_{n2}^2 - \omega^2 + j2\zeta_2\omega_{n2}\omega}$$

In this example, we use $A = 200$, $B = 0.001A$, $\zeta_1 = \zeta_2 = 0.01$, $\omega_{n1} = 2\pi(20)$ and $\omega_{n2} =
2\pi(30)$. Note that $A \gg B$.


Line  MATLAB code                                               Comments

1     clear all                                                 Same as Case 1, except that the
2     fs=100; t=[0:1/fs:5-1/fs];                                parameters for the second mode
3     A=200; B=0.001*A; zeta1=0.01; zeta2=0.01;                 are also defined.
4     wn1=2*pi*20; wd1=sqrt(1-zeta1^2)*wn1;
5     wn2=2*pi*30; wd2=sqrt(1-zeta2^2)*wn2;
6     x=(A/wd1)*exp(-zeta1*wn1*t).*sin(wd1*t) + ...             Expression of the time signal, x(t).
        (B/wd2)*exp(-zeta2*wn2*t).*sin(wd2*t);
7     N=length(x);                                              Same as Case 1.
8     whan=hanning(N); xh=x.*whan';
9     X=fft(x); Xh=fft(xh);
10    f=fs*(0:N-1)/N;
11    H=A./(wn1^2-(2*pi*f).^2+i*2*zeta1*wn1*(2*pi*f)) ...       Expression of the true Fourier
        + B./(wn2^2-(2*pi*f).^2+i*2*zeta2*wn2*(2*pi*f));        transform, $F\{x(t)\}$.
12    figure(1)                                                 Plot the results of the rectangular
13    plot(f(1:N/2+1), 20*log10(abs(X(1:N/2+1)/fs)));           window in dB scale: solid line for
      hold on                                                   the rectangular window and
14    plot(f(1:N/2+1), 20*log10(abs(H(1:N/2+1))), 'g:')         dashed line for the 'true' Fourier
15    axis([0 50 -60 0])                                        transform.
16    xlabel('Frequency (Hz)'); ylabel('Modulus (dB)')
17    hold off
18    figure(2)                                                 Plot the results of the Hann
19    plot(f(1:N/2+1), 20*log10(sqrt(8/3)* ...                  window in dB scale: solid line for
        abs(Xh(1:N/2+1)/fs)))                                   the Hann window, and dashed line
20    hold on                                                   for the 'true' Fourier transform.
21    plot(f(1:N/2+1), 20*log10(abs(H(1:N/2+1))), 'g:')
22    axis([0 50 -160 0])
23    xlabel('Frequency (Hz)'); ylabel('Modulus (dB)')
24    hold off

Results

Comments: Similar to MATLAB Example 4.8, the second mode is clearly noticeable
when the Hann window is used, although the magnitude spectrum is greatly underestimated.
Note that the second mode is almost negligible, i.e. $B \ll A$. So, it is almost
impossible to see the second mode in the true magnitude spectrum and even in the phase
spectrum as shown in Figure (c).


[Figure: true phase spectrum versus Frequency (Hz), 0 to 50 Hz, with the 2nd mode indicated]

(c) Phase spectrum


The reason for these results is not as clear as in MATLAB Example 4.8 where the two
sinusoids are compared. However, it might be argued that the convolution operation in
the frequency domain results in magnifying (or sharpening) the resonance region owing
to the frequency characteristic of the Hann window.

Time Sampling and Aliasing


Introduction 

So far, we have developed the Fourier transform of a continuous signal. However, we usually 
utilize a digital computer to perform the transform. Thus, it is necessary to re-examine Fourier 
methods so as to be able to transform sampled data. We would ‘hope’ that the discrete version 
of the Fourier transform resembles (or approximates) the Fourier integral (Equation (4.6)), 
such that it represents the frequency characteristic (within the range of interest) of the original 
signal. In fact, from the MATLAB examples given in the previous chapter, we have already 
seen that the results of the discrete version (DFT) and the continuous version (Fourier integral) 
appear to be not very different. However, there are fundamental differences between these two 
versions, and in this chapter we shall consider the effect of sampling, and relate the Fourier 
transform of a continuous signal and the transform of a discrete signal (or a sequence). 


5.1 THE FOURIER TRANSFORM OF AN IDEAL 
SAMPLED SIGNAL 

Impulse Train Modulation 

We introduce the Fourier transform of a sequence by using the mathematical notion of
'ideal sampling' of a continuous signal. Consider a 'train' of delta functions $i(t)$ which
is expressed as

$$i(t) = \sum_{n=-\infty}^{\infty}\delta(t - n\Delta) \tag{5.1}$$


Fundamentals of Signal Processing for Sound and Vibration Engineers 
K. Shin and J. K. Hammond. © 2008 John Wiley & Sons, Ltd 




i.e. delta functions located every $\Delta$ seconds as depicted in Figure 5.1.

Figure 5.1 Train of delta functions


Starting with a continuous signal $x(t)$, an ideal uniformly sampled discrete sequence
is $x(n\Delta) = x(t)|_{t=n\Delta}$, evaluated every $\Delta$ seconds of the continuous signal $x(t)$. Since the
sequence $x(n\Delta)$ is discrete, we cannot apply the Fourier integral. Instead, the ideally
sampled signal is often modelled mathematically as the product of the continuous signal
$x(t)$ with the train of delta functions $i(t)$, i.e. the sampled signal can be written
as

$$x_s(t) = x(t)i(t) \tag{5.2}$$

The reciprocal of the sampling interval, $f_s = 1/\Delta$, is called the sampling rate, which
is the number of samples per second. The sampling procedure can be illustrated as in
Figure 5.2.

[Figure: panels showing $x(t)$, $i(t)$ and $x_s(t)$]

Figure 5.2 Impulse train representation of a sampled signal


In this way we see that $x_s(t)$ is an amplitude-modulated train of delta functions. We
also note that $x_s(t)$ is not the same as $x(n\Delta)$ since it involves delta functions. However,
it is a convenient step to help us form the Fourier transform of the sequence $x(n\Delta)$, as
follows. Let $X_s(f)$ denote the Fourier transform of the sampled signal $x_s(t)$. Then, using
properties of the delta function,

$$X_s(f) = \int_{-\infty}^{\infty}\left[x(t)\sum_{n=-\infty}^{\infty}\delta(t - n\Delta)\right]e^{-j2\pi ft}\,dt = \sum_{n=-\infty}^{\infty}\left[\int_{-\infty}^{\infty}x(t)e^{-j2\pi ft}\,\delta(t - n\Delta)\,dt\right] = \sum_{n=-\infty}^{\infty}x(n\Delta)e^{-j2\pi fn\Delta} \tag{5.3}$$

The summation (5.3) now involves the sequence $x(n\Delta)$ and is (in principle) computable. It
is this expression that defines the Fourier transform of a sequence. We are now in a position
to note some fundamental differences between the Fourier transform $X(f)$ of the original
continuous signal $x(t)$ and $X_s(f)$, the Fourier transform of the uniformly sampled version
$x(n\Delta)$.

Note that Equation (5.3) implies that $X_s(f)$ has a periodic structure in frequency with
period $1/\Delta$. For example, for an integer number $r$, $X_s(f + r/\Delta)$ becomes

$$X_s(f + r/\Delta) = \sum_{n=-\infty}^{\infty}x(n\Delta)e^{-j2\pi(f + r/\Delta)n\Delta} = \sum_{n=-\infty}^{\infty}x(n\Delta)e^{-j2\pi fn\Delta}e^{-j2\pi rn} = \sum_{n=-\infty}^{\infty}x(n\Delta)e^{-j2\pi fn\Delta} = X_s(f) \tag{5.4}$$

since $e^{-j2\pi rn} = 1$ for integer $r$ and $n$.

This periodicity in frequency will be discussed further shortly. The inverse Fourier transform
of $X_s(f)$ can be found by multiplying both sides of Equation (5.3) by $e^{j2\pi fr\Delta}$ and integrating
with respect to $f$ from $-1/2\Delta$ to $1/2\Delta$ (since $X_s(f)$ is periodic, we need to integrate over
only one period), and taking account of the orthogonality of the exponential function. Then

$$\int_{-1/2\Delta}^{1/2\Delta}X_s(f)e^{j2\pi fr\Delta}\,df = \int_{-1/2\Delta}^{1/2\Delta}\left[\sum_{n=-\infty}^{\infty}x(n\Delta)e^{-j2\pi fn\Delta}\right]e^{j2\pi fr\Delta}\,df = \sum_{n=-\infty}^{\infty}\int_{-1/2\Delta}^{1/2\Delta}x(n\Delta)e^{-j2\pi fn\Delta}e^{j2\pi fr\Delta}\,df$$

$$= \sum_{n=-\infty}^{\infty}x(n\Delta)\int_{-1/2\Delta}^{1/2\Delta}e^{-j2\pi f(n - r)\Delta}\,df = x(r\Delta)\frac{1}{\Delta} \tag{5.5}$$

Thus, we summarize the Fourier transform pair for the 'sampled sequence' as below,
where we rename $X_s(f)$ as $X(e^{j2\pi f\Delta})$:

$$X(e^{j2\pi f\Delta}) = \sum_{n=-\infty}^{\infty}x(n\Delta)e^{-j2\pi fn\Delta} \tag{5.6}$$

$$x(n\Delta) = \Delta\int_{-1/2\Delta}^{1/2\Delta}X(e^{j2\pi f\Delta})e^{j2\pi fn\Delta}\,df \tag{5.7}$$

Note that the scaling factor $\Delta$ is present in Equation (5.7).
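
The transform pair and the periodicity can be checked numerically for a short sequence. The following is a minimal sketch, assuming an arbitrary four-point sequence and a sampling interval of 0.1 s (both made up for illustration); the integral in Equation (5.7) is approximated by a sum over M equally spaced frequencies in one period.

```matlab
% Sketch: Equations (5.4), (5.6) and (5.7) for a short, arbitrary sequence.
Delta = 0.1;                         % sampling interval (assumed value)
x = [1 2 -1 0.5];                    % x(n*Delta), n = 0..3 (made-up data)
n = 0:length(x)-1;
Xs = @(f) sum(x.*exp(-1j*2*pi*f*n*Delta));   % Equation (5.6) at one frequency

% Periodicity, Equation (5.4): shifting f by r/Delta changes nothing.
f0 = 1.3; r = 2;
per_err = abs(Xs(f0 + r/Delta) - Xs(f0));

% Inverse, Equation (5.7): integrate over one period of width 1/Delta.
M = 64;                              % number of frequency points in one period
f = (-M/2:M/2-1)/(M*Delta);          % from -1/(2*Delta) up to just below 1/(2*Delta)
df = 1/(M*Delta);
Xf = arrayfun(Xs, f);
xr = zeros(size(x));
for k = 1:length(x)
    xr(k) = Delta*sum(Xf.*exp(1j*2*pi*f*n(k)*Delta))*df;
end
inv_err = max(abs(xr - x));          % recovered sequence against the original
```

Omitting the factor Delta in the last loop would return the sequence scaled by 1/Delta, which is the point made about Equation (5.7).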


The Link Between $X(e^{j2\pi f\Delta})$ and $X(f)$

At this stage, we may ask: 'How is the Fourier transform of a sequence $X(e^{j2\pi f\Delta})$ related
to the Fourier transform of a continuous signal $X(f)$?' In order to answer this, we need to
examine the periodicity of $X(e^{j2\pi f\Delta})$ as follows.

Note that $i(t)$ in Equation (5.1) is a periodic signal with period $\Delta$, thus it has a Fourier
series representation. Since the fundamental period $T_P = \Delta$, we can write the train of delta
functions as (see Equation (3.34))

$$i(t) = \sum_{n=-\infty}^{\infty}c_n e^{j2\pi nt/\Delta} \tag{5.8}$$

where the Fourier coefficients are found from Equation (3.35) such that

$$c_n = \frac{1}{\Delta}\int_{-\Delta/2}^{\Delta/2}i(t)e^{-j2\pi nt/\Delta}\,dt = \frac{1}{\Delta} \tag{5.9}$$

Thus, Equation (5.8) can be rewritten as

$$i(t) = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}e^{j2\pi nt/\Delta} \tag{5.10}$$


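(Recall Equation (3.30) which is equivalent to this.) Using the property of the delta
function $\int_{-\infty}^{\infty}e^{\pm j2\pi at}\,dt = \delta(a)$, the Fourier transform of Equation (5.10) can be calculated
as

$$I(f) = F\{i(t)\} = \int_{-\infty}^{\infty}\left[\frac{1}{\Delta}\sum_{n=-\infty}^{\infty}e^{j2\pi nt/\Delta}\right]e^{-j2\pi ft}\,dt = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-j2\pi(f - n/\Delta)t}\,dt = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}\delta\!\left(f - \frac{n}{\Delta}\right) \tag{5.11}$$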


Thus, the Fourier transform of the train of delta functions can be drawn in the frequency
domain as in Figure 5.3.

Since the Fourier transform of $x_s(t)$ results in the convolution of $X(f)$ with $I(f)$ in the
frequency domain, i.e. $X_s(f) = F\{x(t)i(t)\} = X(f) * I(f)$, it follows that

$$X_s(f) = I(f) * X(f) = \int_{-\infty}^{\infty}I(g)X(f - g)\,dg = \int_{-\infty}^{\infty}\left[\frac{1}{\Delta}\sum_{n=-\infty}^{\infty}\delta\!\left(g - \frac{n}{\Delta}\right)\right]X(f - g)\,dg$$

$$= \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}\delta\!\left(g - \frac{n}{\Delta}\right)X(f - g)\,dg = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}X\!\left(f - \frac{n}{\Delta}\right) \tag{5.12}$$

[Figure: $I(f)$ versus $f$, with delta functions at multiples of $1/\Delta$]

Figure 5.3 Fourier transform of the train of delta functions

This gives an alternative form of Equation (5.6), which is

$$X(e^{j2\pi f\Delta}) = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}X\!\left(f - \frac{n}{\Delta}\right) \tag{5.13}$$

This important equation describes the relationship between the Fourier transform of a
continuous signal and the Fourier transform of a sequence obtained by ideal sampling
every $\Delta$ seconds. That is, the Fourier transform of the sequence $x(n\Delta)$ is the sum of shifted
versions of the Fourier transform of the underlying continuous signal. This reinforces the
periodic nature of $X(e^{j2\pi f\Delta})$. Note also that the 'scaling' effect $1/\Delta$, i.e. the sampling
rate $f_s = 1/\Delta$, is a multiplier of the sum in Equation (5.13).

So, the 'sampling in the time domain' implies a 'periodic and continuous structure
in the frequency domain' as illustrated in Figure 5.4. From Equation (5.6), it can be
seen that $X_s(f_s - f) = X_s^*(f)$, where * denotes complex conjugate. This is confirmed
(for the modulus) from Figure 5.4. Thus, all the information in $X_s(f)$ lies in the range
$0 < f < f_s/2$. This figure emphasizes the difference between $X(f)$ and $X(e^{j2\pi f\Delta})$, and
leads to the concept of 'aliasing', which arises from the possible overlapping between
the replicas of $X(f)$. This will be discussed further in the next section.
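
Equation (5.13) can be checked numerically for a signal whose Fourier transform is known in closed form. The following is a minimal sketch, assuming the two-sided exponential $x(t) = e^{-a|t|}$, for which $X(f) = 2a/(a^2 + (2\pi f)^2)$; the values of a and of the sampling interval, and the truncation limits of the two sums, are our own choices.

```matlab
% Sketch: check Equation (5.13) for x(t) = exp(-a*|t|).
a = 2; Delta = 0.1;                      % assumed values
X = @(f) 2*a./(a^2 + (2*pi*f).^2);       % Fourier transform of the continuous signal
f = [0 1 2.5 4];                         % test frequencies in Hz
n = -400:400;                            % samples; later terms are negligible
m = -5000:5000;                          % shifted replicas kept in the truncated sum
lhs = zeros(size(f)); rhs = zeros(size(f));
for k = 1:length(f)
    % Equation (5.6): Fourier transform of the sampled sequence
    lhs(k) = real(sum(exp(-a*abs(n)*Delta).*exp(-1j*2*pi*f(k)*n*Delta)));
    % Equation (5.13): scaled sum of shifted versions of X(f)
    rhs(k) = sum(X(f(k) - m/Delta))/Delta;
end
rel_err = max(abs(lhs - rhs)./abs(lhs));
```

Since this $X(f)$ is not band-limited, the replicas overlap, and the value of the sum at f = 0 is slightly larger than $X(0)/\Delta$. This is the overlapping referred to above.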


[Figure: $\Delta\,|X(e^{j2\pi f\Delta})|$ versus $f$]

Figure 5.4 Fourier transform of the sampled sequence


An Alternative Route to the Derivation of the Fourier Transform 
of a Sequence 

The z-transform 

The expression for the Fourier transform of a sequence, Equation (5.6), can also be obtained
via the z-transform of a sequence. The z-transform is widely used in the solution of difference
equations, just as the Laplace transform is used for differential equations. The definition of
the z-transform $X(z)$ of a sequence of numbers $x(n)$ is

$$X(z) = \sum_{n=-\infty}^{\infty}x(n)z^{-n} \tag{5.14}$$

where $z$ is the complex-valued argument of the transform and $X(z)$ is a function of a complex
variable. In Equation (5.14), the notion of time is not explicitly made, i.e. we write $x(n)$ for
$x(n\Delta)$. It is convenient here to regard the sampling interval as set to unity. Since $z$ is complex,
it can be written in polar form, i.e. using the magnitude and phase such that $z = re^{j\omega}$, and is
represented in a complex plane (polar coordinates) as shown in Figure 5.5(a). If this expression
is substituted into Equation (5.14), it gives

$$X(re^{j\omega}) = \sum_{n=-\infty}^{\infty}x(n)(re^{j\omega})^{-n} = \sum_{n=-\infty}^{\infty}x(n)r^{-n}e^{-j\omega n} \tag{5.15}$$

If we further restrict our interest to the unit circle in the z-plane, i.e. $r = 1$, so $z = e^{j\omega} = e^{j2\pi f}$,
then Equation (5.15) is reduced to

$$X(e^{j2\pi f}) = \sum_{n=-\infty}^{\infty}x(n)e^{-j2\pi fn} \tag{5.16}$$

which is exactly the same form for the Fourier transform of a sequence as given in Equation (5.6)
for sampling interval $\Delta = 1$.

[Figure: (a) Representation of the z-plane; (b) Relationship between $X(z)$ and $X(e^{j\omega})$; axes Re(z) and Im(z)]

Figure 5.5 Representation of the z-plane and the Fourier transform of a sequence

Thus, it can be argued that the evaluation of the z-transform on the unit circle in the
z-plane yields the Fourier transform of a sequence as shown in Figure 5.5(b). This is analogous
to the continuous-time case where the Laplace transform reduces to the Fourier transform if
it is evaluated on the imaginary axis, i.e. $s = j\omega$.
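
For a finite sequence this statement can be verified directly. The following is a minimal sketch, assuming an arbitrary five-point sequence starting at n = 0: the z-transform of Equation (5.14) is evaluated at N equally spaced points on the unit circle and compared with the output of 'fft'.

```matlab
% Sketch: the z-transform evaluated on the unit circle gives the DFT samples.
x = [3 -1 2 0.5 1];                     % arbitrary finite sequence, n = 0..4
N = length(x); n = 0:N-1;
Xz = @(z) sum(x.*z.^(-n));              % Equation (5.14) for a finite sequence
Xc = zeros(1, N);
for k = 0:N-1
    Xc(k+1) = Xz(exp(1j*2*pi*k/N));     % z = exp(j*2*pi*f) with f = k/N, r = 1
end
dft_err = max(abs(Xc - fft(x)));        % difference from the DFT
```

Evaluating the same function handle at a point with r different from 1 gives a value that is no longer a Fourier transform of x(n), but that of the sequence weighted by $r^{-n}$, as Equation (5.15) shows.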


Relationship Between the Laplace Transform and the z-transform 

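To see the effect of sampling on the z-plane, we consider the relationship between the Laplace
transform and the z-transform. The Laplace transform of $x(t)$, $L\{x(t)\}$, is defined as

$$X(s) = \int_{-\infty}^{\infty}x(t)e^{-st}\,dt \tag{5.17}$$

where $s = \sigma + j2\pi f$ is a complex variable. Note that if $s = j2\pi f$, then $X(f) = X(s)|_{s=j2\pi f}$.
Now, let $\hat{X}(s)$ be the Laplace transform of an (ideally) sampled function; then

$$\hat{X}(s) = L\{x(t)i(t)\} = \int_{-\infty}^{\infty}\left[x(t)\sum_{n=-\infty}^{\infty}\delta(t - n\Delta)\right]e^{-st}\,dt = \sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}x(t)e^{-st}\,\delta(t - n\Delta)\,dt = \sum_{n=-\infty}^{\infty}x(n\Delta)e^{-sn\Delta} \tag{5.18}$$

If $z = e^{s\Delta}$, then Equation (5.18) becomes the z-transform, i.e.

$$\hat{X}(s)\big|_{z=e^{s\Delta}} = \sum_{n=-\infty}^{\infty}x(n\Delta)z^{-n} = X(z) \tag{5.19}$$

Comparing Equation (5.19) with Equation (5.6), it can be shown that if $z = e^{j2\pi f\Delta}$, then

$$X(e^{j2\pi f\Delta}) = X(z)\big|_{z=e^{j2\pi f\Delta}} \tag{5.20}$$

Also, using the similar argument made in Equation (5.13), i.e. using

$$i(t) = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}e^{j2\pi nt/\Delta}$$

an alternative form of $\hat{X}(s)$ can be written as

$$\hat{X}(s) = \int_{-\infty}^{\infty}x(t)\,\frac{1}{\Delta}\sum_{n=-\infty}^{\infty}e^{j2\pi nt/\Delta}e^{-st}\,dt = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}x(t)e^{-(s - j2\pi n/\Delta)t}\,dt \tag{5.21}$$

Thus,

$$\hat{X}(s) = \frac{1}{\Delta}\sum_{n=-\infty}^{\infty}X\!\left(s - \frac{j2\pi n}{\Delta}\right) \tag{5.22}$$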


From Equation (5.22), we can see that $\hat{X}(s)$ is related to $X(s)$ by adding shifted versions
of scaled $X(s)$ to produce $\hat{X}(s)$ as depicted in Figure 5.6(b) below, which is similar
to the relationship between the Fourier transform of a sampled sequence and the Fourier
transform.

Note that, as we can see from Equation (5.19), $X(z)$ is not directly related to $X(s)$, but
it is related to $\hat{X}(s)$ via $z = e^{s\Delta}$. Further, if we let $s = j2\pi f$, then we have the following
relationship:

$$X_s(f)\ \left(= X(e^{j2\pi f\Delta})\right) = \hat{X}(s)\big|_{s=j2\pi f} = X(z)\big|_{z=e^{j2\pi f\Delta}} \tag{5.23}$$

The relationships between $X(s)$, $\hat{X}(s)$ and $X(z)$ are illustrated in Figure 5.6. In this figure, a
pole is included in the s-plane to demonstrate how it is mapped to the z-plane. We can see
that a single pole in the $X(s)$-plane results in an infinite number of poles in the $\hat{X}(s)$-plane;
then this infinite series of poles all map onto a single pole in the $X(z)$-plane. In effect, the left
hand side of the s-plane is mapped to the inside of the unit circle of the z-plane. However, we
must realize that, due to the sampling process, what is mapped onto the z-plane is not $X(s)$, but
$\hat{X}(s)$, and each 'strip' in the left hand side of $\hat{X}(s)$ is mapped onto the z-plane such that
it fills the complete unit circle. This indicates the 'periodic' structure in the frequency domain
as well as possible aliasing in the frequency domain.
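
The mapping of the repeated poles can be illustrated numerically. The following is a minimal sketch, assuming an arbitrary pole in the left half of the s-plane and a sampling interval of 0.01 s (both made up for illustration): the shifted poles of Equation (5.22) are mapped through $z = e^{s\Delta}$ and compared.

```matlab
% Sketch: the poles s0 + j*2*pi*k/Delta all map to the same point in the z-plane.
Delta = 0.01;                           % sampling interval (assumed)
s0 = -3 + 1j*2*pi*20;                   % an arbitrary left-half-plane pole
k = -3:3;
s = s0 + 1j*2*pi*k/Delta;               % shifted poles, as in Equation (5.22)
z = exp(s*Delta);                       % mapping z = exp(s*Delta)
spread = max(abs(z - z(k == 0)));       % distance between the mapped poles
radius = abs(z(k == 0));                % equals exp(real(s0)*Delta)
```

Because the real part of the pole is negative, the radius is less than one, i.e. the pole lies inside the unit circle.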

The above mapping process is sometimes used in designing an IIR (Infinite Impulse
Response) digital filter from an existing analogue filter, and is called the impulse-invariant
method.

[Figure 5.6: (a) Analogue (continuous-time) domain; (b) Laplace transform of a sampled function; (c) Digital (discrete-time) domain]

5.2 ALIASING AND ANTI-ALIASING FILTERS

As noted in the previous section. Equation (5.13) describes how the frequency components 
of the sampled signal are related to the Fourier transform of the original continuous signal. 
A pictorial description of the sampling effect follows. Consider the Fourier transform that 
has X(f) — 0 for | f\ > f H , as given in Figure 5.7. 
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X(f) 



/ff J H 


Figure 5.7 Fourier transform of a continuous signal such that X(f) = 0 for |/| > f H 

Assuming that the sampling rate f s = 1 /A is such that > 2 /#, i.e. f H <l /2A, 
then Figure 5.8 shows the corresponding (scaled) Fourier transform of a sampled sequence 
A ■ X s (f) (or A ■ X(e j2lI f A )). Note that the scaling factor A is introduced (see Equa- 
tion (5.13)), and some commonly used terms are defined in the figure. Thus A ■ X s (f) 
accurately represents X(f) for | f\ < 1/2A. 


Figure 5.8 Fourier transform of a sampled sequence, f_s > 2f_H

Suppose now that f_s < 2f_H. Then there is an overlapping of the shifted versions of
X(f), resulting in a distortion of the frequencies for |f| < 1/2Δ as shown in Figure 5.9.

Figure 5.9 Fourier transform of a sampled sequence, f_s < 2f_H


This ‘distortion’ is due to the fact that high-frequency components in the signal
are not distinguishable from lower frequencies because the sampling rate f_s is not high
enough. Thus, it is clear that to avoid this distortion the highest frequency in the signal
f_H should be less than f_s/2. This upper frequency limit is often called the Nyquist frequency
(see Figure 5.8).

This distortion is referred to as aliasing. Consider the particular case of a harmonic wave
of frequency p Hz, e.g. cos(2πpt) as in Figure 5.10. We sample this signal every Δ seconds, i.e.
f_s = 1/Δ (with, say, p < f_s/2), to produce the sampled sequence cos(2πpnΔ). Now, consider
another cosine wave of frequency (p + 1/Δ) Hz, i.e. cos[2π(p + 1/Δ)t]; again we sample
this every Δ seconds to give cos[2π(p + 1/Δ)nΔ] = cos(2πpnΔ + 2πn), which for integer n
is cos(2πpnΔ), identical to the above. So, simply given the sample values, how do we know
which cosine wave they come from?

Figure 5.10 Illustration of the aliasing phenomenon: cos(2πpt) and cos[2π(p + 1/Δ)t]

In fact, the same sample values could have arisen from any cosine wave having
frequency ±p + (k/Δ) (k = 1, 2, ...), i.e. cos(2πpnΔ) is indistinguishable from
cos[2π(±p + k/Δ)nΔ]. So if a frequency component is detected at p Hz, any one of these
higher frequencies can be responsible for this rather than a ‘true’ component at p Hz. This
phenomenon of higher frequencies looking like lower frequencies is called aliasing. The
values ±p + k/Δ are possible aliases of frequency p Hz, and can be seen graphically for some
p Hz between 0 and 1/2Δ by ‘pleating’ the frequency axis as shown in Figure 5.11 (Bendat
and Piersol, 2000).


Figure 5.11 Possible aliases of frequency p Hz
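The alias relation ±p + k/Δ can be checked directly on sample values. The sketch below is illustrative only: the helper names `sample_cosine` and `apparent_frequency`, and the values f_s = 100 Hz and p = 20 Hz, are assumptions made for the example. It samples several candidate aliases and compares them with the samples of the p Hz cosine, and it folds each frequency back into the range 0 to f_s/2 in the manner of the ‘pleated’ axis.

```python
import math

def sample_cosine(freq_hz, fs_hz, n_samples):
    """Return n_samples of cos(2*pi*freq*t) taken at t = n / fs_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]

def apparent_frequency(freq_hz, fs_hz):
    """Fold a frequency into the range 0..fs/2 (the 'pleated' frequency axis)."""
    f = freq_hz % fs_hz                  # remove whole multiples of fs
    return fs_hz - f if f > fs_hz / 2 else f

fs = 100.0      # sampling rate, 1/delta
p = 20.0        # frequency below fs/2
reference = sample_cosine(p, fs, 50)

worst_overall = 0.0
for alias in (fs - p, fs + p, 2 * fs - p, 2 * fs + p):
    samples = sample_cosine(alias, fs, 50)
    worst = max(abs(a - b) for a, b in zip(reference, samples))
    worst_overall = max(worst_overall, worst)
    print(f"{alias:6.1f} Hz looks like {apparent_frequency(alias, fs):5.1f} Hz, "
          f"max sample difference = {worst:.2e}")
```

The sample differences are at the level of floating-point rounding, so the sequences cannot be told apart from the samples alone.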


To avoid aliasing the signal must be band-limited, i.e. it must not have any frequency
component above a certain frequency, say f_H, and the sampling rate must be chosen to
be greater than twice the highest frequency contained in the signal, namely

$$ f_s > 2 f_H \tag{5.24} $$

So, it would appear that we need to know the highest frequency component in the signal. 
Unfortunately, in many cases the frequency content of a signal will not be known and so 
the choice of sampling rate is problematic. The way to overcome this difficulty is to filter 
the signal before sampling, i.e. filter the analogue signal using an analogue low-pass 
filter. This filter is often referred to as an anti-aliasing filter. 


Anti-aliasing Filters 

In general, the signal x(t) may not be band-limited, in which case aliasing will distort the
spectral information. We must therefore eliminate ‘undesirable’ high-frequency components
by applying an anti-aliasing low-pass filter to the analogue signal prior to digitization. The
‘anti-aliasing’ filter should have the following properties:

• flat passband;

• sharp cut-off characteristics;

• low distortion (i.e. linear phase characteristic in the passband);

• matched amplitude and phase characteristics across channels, since multi-channel
analysers need a set of parallel anti-aliasing filters.

Filters are characterized by their frequency response functions H(f), e.g. as shown in
Figure 5.12.

Figure 5.12 Typical characteristics of a low-pass filter

Some typical anti-aliasing filters are shown in Figure 5.13. 



Figure 5.13 Some commonly used anti-aliasing low-pass filters: (a) Butterworth low-pass filter (half power, −3 dB, point marked); (b) Chebychev low-pass filter, with fast cut-off (Type I: equiripple passband, monotonic stopband; Type II: monotonic passband, equiripple stopband)
We shall assume that the anti-aliasing filter operates on the signal x(t) to produce a
signal to be digitized as illustrated in Figure 5.14.

Figure 5.14 The use of anti-aliasing filter prior to sampling (x(t) → anti-aliasing filter → ADC → x(nΔ))

But we still need to decide what the highest frequency f_H is just prior to the ADC
(analogue-to-digital converter). The critical features in deciding this are:

• the ‘cut-off’ frequency of the filter f_c (Hz), usually taken as the 3 dB point of the filter;

• the ‘roll-off rate’ of the filter in dB/octave (B in Figure 5.15);

• the ‘dynamic range’ of the acquisition system in dB (A in Figure 5.15). (Dynamic range
is discussed in the next section.)

These terms and the effect of sampling rate are depicted in Figure 5.15. Note that, in this
figure, if f_s > 2f_stop (≈ 2f_H) there is no aliasing, and if f_s > f_stop + f_c there is no
aliasing up to f_c. Also note that it is not the 3 dB point of the filter which should satisfy the
Nyquist criterion; rather, at the Nyquist frequency the filter response should be negligible
(e.g. at least 40 dB down on the passband).



Figure 5.15 Characteristics of the anti-aliasing filter 


If the spectrum is to be used up to f_c Hz, then the figure indicates how f_s is chosen. Using
simple trigonometry,

$$ \frac{A}{\log_2(f_{stop}/f_c)} = B \ \text{(dB/octave)} \tag{5.25} $$

Note that if B is in dB/decade, then the logarithm is to base 10. Some comments on the octave
are: if f_2 = 2^n f_1, then f_2 is ‘n’ octaves above f_1; thus, log_2 f_2 = n + log_2 f_1 and
log_2 f_2 − log_2 f_1 = log_2(f_2/f_1) = n (octaves).

From Equation (5.25), it can be shown that f_stop = 2^{A/B} f_c. Substituting this expression
into f_s > f_stop + f_c, which is the condition for no aliasing up to the cut-off frequency f_c
(see Figure 5.15), we have the following condition for the sampling rate:

$$ f_s > f_c\,(1 + 2^{A/B}) \approx f_c\,(1 + 10^{0.3A/B}) \tag{5.26} $$


For example, if A = 70 dB and B = 70 dB/octave, then f_s > 3f_c, and if A = 70 dB and
B = 90 dB/octave, then f_s > f_c(1 + 2^{70/90}) ≈ 2.7f_c. However, the following practical
guide (which is based on twice the frequency at the noise floor) is often used:

$$ f_s = 2 f_{stop}\ (\approx 2 f_H) \approx 2 \times 10^{0.3A/B} f_c \tag{5.27} $$

For example, if A = 70 dB and B = 90 dB/octave, then f_s ≈ 3.42f_c, which gives a more
conservative result than Equation (5.26).
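The two sampling-rate rules can be evaluated for any filter and acquisition system. The sketch below is a hedged illustration: the function names and the 1 kHz cut-off are hypothetical, and it uses the exact power of two from Equation (5.25) rather than the 10^{0.3A/B} approximation, so the practical-guide value comes out as about 3.43f_c instead of the rounded 3.42f_c quoted in the text.

```python
def stop_frequency(fc_hz, a_db, b_db_per_octave):
    """f_stop = 2**(A/B) * f_c, obtained by rearranging Equation (5.25)."""
    return fc_hz * 2.0 ** (a_db / b_db_per_octave)

def min_rate_no_aliasing_up_to_fc(fc_hz, a_db, b_db_per_octave):
    """Lower bound of Equation (5.26): f_s > f_c * (1 + 2**(A/B))."""
    return fc_hz + stop_frequency(fc_hz, a_db, b_db_per_octave)

def practical_rate(fc_hz, a_db, b_db_per_octave):
    """Practical guide of Equation (5.27): f_s = 2 * f_stop."""
    return 2.0 * stop_frequency(fc_hz, a_db, b_db_per_octave)

fc = 1000.0   # illustrative cut-off frequency in Hz
for a_db, b_db in ((70.0, 70.0), (70.0, 90.0)):
    bound = min_rate_no_aliasing_up_to_fc(fc, a_db, b_db)
    guide = practical_rate(fc, a_db, b_db)
    print(f"A = {a_db:.0f} dB, B = {b_db:.0f} dB/octave: "
          f"f_s > {bound / fc:.2f} f_c (5.26), f_s = {guide / fc:.2f} f_c (5.27)")
```

Because f_stop is always above f_c, the practical guide 2f_stop is never lower than the bound f_c + f_stop.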

In general, the cut-off frequency f_c and the roll-off rate of the anti-aliasing filter should be
chosen with the particular application in mind. But, very roughly speaking, if the 3 dB point of
the filter is a quarter of the sampling rate f_s and the roll-off rate is better than 48 dB/octave, then
this gives a 40 to 50 dB reduction at the folding frequency f_s/2. This may result in an acceptable
level of aliasing (though we note that this may not be adequate for some applications).

Choosing an appropriate sampling rate is important. Although we must avoid aliasing,
unnecessarily high sampling rates are not desirable. The ‘optimal’ sampling rate must be
selected according to the specific application (the bandwidth of interest) and the characteristics
of the anti-aliasing filter to be used.

There is another very important aspect to note. If the sampled sequence x(nΔ) is sampled
again (digitally, i.e. downsampled), the resulting sequence can be aliased if an appropriate
anti-aliasing ‘digital’ low-pass filter is not applied before the resampling. This is demonstrated
by MATLAB Examples 5.2 and 5.3. Also note that aliasing does occur in most computer
simulations. For example, if a numerical integration method (such as the Runge-Kutta method)
is applied to solve ordinary differential equations, there is no simple way to avoid
the aliasing problem (see comments of MATLAB Example 6.5 in Chapter 6).
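The effect of digital resampling with and without a digital anti-aliasing filter can be sketched without any toolbox. The example below is an assumption-laden illustration, not the method used by the MATLAB examples: it uses a Hamming-windowed sinc FIR low-pass filter with a hypothetical 20 Hz cut-off and 101 taps, and the helper names are invented for the sketch. A 40 Hz tone sampled at 500 Hz is reduced to 50 Hz by keeping every tenth sample.

```python
import math

def lowpass_fir(cutoff, num_taps=101):
    """Hamming-windowed sinc low-pass; cutoff in cycles per sample (0..0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for k in range(num_taps):
        m = k - mid
        ideal = 2 * cutoff if m == 0 else math.sin(2 * math.pi * cutoff * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [h / gain for h in taps]          # normalize to unity gain at 0 Hz

def fir_filter(x, taps):
    """Convolve x with taps, keeping only outputs with full filter overlap."""
    n_taps = len(taps)
    return [sum(taps[k] * x[n - k] for k in range(n_taps))
            for n in range(n_taps - 1, len(x))]

def rms(x):
    """Root mean square value of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

fs, factor = 500.0, 10                       # downsample 500 Hz -> 50 Hz
x = [math.sin(2 * math.pi * 40.0 * n / fs) for n in range(2000)]

naive = x[::factor]                          # no filter: 40 Hz aliases to 10 Hz
taps = lowpass_fir(cutoff=20.0 / fs)         # assumed cut-off below the new 25 Hz folding frequency
filtered = fir_filter(x, taps)[::factor]     # filter first, then discard samples

print(f"rms without filter: {rms(naive):.3f}")
print(f"rms with filter:    {rms(filtered):.5f}")
```

Without the filter the 40 Hz tone survives at full strength as a spurious 10 Hz component; with the filter it is largely removed before the samples are discarded.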


5.3 ANALOGUE-TO-DIGITAL CONVERSION AND 
DYNAMIC RANGE 

An ADC is a device that takes a continuous (analogue) time signal as an input and produces 
a sequence of numbers (digital) as an output that are sample values of the input. It may be 
convenient to consider the ADC process as consisting of two phases, namely sampling and 
quantization, as shown in Figure 5.16. 

Note that actual ADCs do not consist of two separate stages (as in the conceptual figure),
and various different types are available. In Figure 5.16, x(nΔ) is the exact value of the time
signal x(t) at time t = nΔ, i.e. it is the ideally sampled sequence with sample interval Δ. x̃(nΔ)
is the representation of x(nΔ) on a computer, and is different from x(nΔ) since a ‘finite number
of bits’ are used to represent each number. Thus, we can expect that some errors are produced
in the quantization process.

Figure 5.16 Conceptual model of the analogue-to-digital conversion (x(t) → sampler → x(nΔ) → quantizer → x̃(nΔ))

Now consider the problem of quantization, as shown in Figure 5.17.

Figure 5.17 Quantization process (x(nΔ) → quantizer → x̃(nΔ))

Suppose the ADC represents a number using 3 bits (and a sign bit), i.e. a 4 bit ADC as
given in Figure 5.18.

Figure 5.18 A digital representation of a 4 bit ADC (a sign bit followed by a 3 bit digital word)

Each bit is either 0 or 1, i.e. two states, so there are 2^3 = 8 possible states to represent a
number. If the input voltage range is ±10 volts then the 10 volt range of each polarity must be
allocated to the eight possible states in some way, as shown in Figure 5.19.


Figure 5.19 Digital representation of an analogue signal using a 4 bit ADC (horizontal axis: input voltage x(nΔ), with steps at ±5/8, ±15/8, ±25/8 volts; vertical axis: digital representation x̃(nΔ), with levels 000, 001 (= 10/8), 010 (= 20/8), 011)

In Figure 5.19, any input voltage between −5/8 and 5/8 volts will be represented by
the bit pattern [000], and from 5/8 to 15/8 volts by [001], etc. The rule for assigning the bit
pattern to the input range depends on the ADC. In the above example the steps are uniform
and the ‘error’ can be expressed as

$$ e(n\Delta) = \tilde{x}(n\Delta) - x(n\Delta) \tag{5.28} $$

Note that, for the particular quantization process given in Figure 5.19, the error e(nΔ) has
values between −5/8 and 5/8. This error is called the quantization noise (or quantization
error). From this it is clear that ‘small’ signals will be poorly represented, e.g. within the input
voltage range of ±10 volts, a sine wave of amplitude ±1.5 volts, say, will be represented by
the 4 bit ADC as shown in Figure 5.20.


Figure 5.20 Example of poor digital representation (quantized signal: x̃(nΔ) = x(nΔ) + e(nΔ), taking only the levels 0 and ±10/8 volts)


What will happen for a sine wave of amplitude ±10 volts and another sine wave of
amplitude ±11 volts? The former corresponds to the maximum dynamic range, and the latter
signal will be clipped.
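The 4 bit example can be mimicked with a few lines of code. This is a sketch under stated assumptions: the function `quantize` is hypothetical, it rounds the magnitude to the nearest of the eight levels 0, 10/8, ..., 70/8 volts and clips anything larger, which is only one possible assignment rule (the text notes that the rule depends on the ADC).

```python
import math

def quantize(voltage, full_scale=10.0, bits=3):
    """Uniform quantizer with a sign bit and `bits` magnitude bits (assumed rule).

    The step is full_scale / 2**bits (10/8 V here); magnitudes are rounded to
    the nearest level and clipped at the largest available code.
    """
    step = full_scale / 2 ** bits
    max_code = 2 ** bits - 1
    code = min(int(abs(voltage) / step + 0.5), max_code)   # round, then clip
    if code == 0:
        return 0.0
    return code * step if voltage > 0 else -code * step

# A 'small' sine wave of amplitude 1.5 V only ever reaches three levels
small = [1.5 * math.sin(2 * math.pi * n / 64) for n in range(64)]
levels = sorted({quantize(v) for v in small})
print("levels used by the 1.5 V sine:", levels)

# Quantization error stays within half a step when there is no clipping
worst_error = max(abs(quantize(v) - v) for v in small)
print("largest error:", worst_error)

# An 11 V input is clipped to the largest level under this rule
print("quantize(11.0) =", quantize(11.0))
```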

Details of quantization error can be found in various references (Oppenheim and Schafer,
1975; Rabiner and Gold, 1975; Childers and Durling, 1975; Otnes and Enochson, 1978).
A brief summary is given below. The error e(nΔ) is often treated as random ‘noise’. The
probability distribution of e(nΔ) depends on the particular way in which the quantization
occurs. Often it is assumed that this error has a uniform distribution (with zero mean) over
one quantization step, and is stationary and ‘white’. The probability density function of e(nΔ)
is shown in Figure 5.21, where δ = X/2^b for a b bit word length (excluding the sign bit), and
X (volts) corresponds to the full range of the ADC. Note that δ = 10/2^b = 10/2^3 = 10/8 in
our example above. The variance of e(nΔ) is then


$$ \mathrm{Var}(e) = \sigma_e^2 = \int_{-\infty}^{\infty} (e - \mu_e)^2 p(e)\,de = \frac{1}{\delta}\int_{-\delta/2}^{\delta/2} e^2\,de = \frac{\delta^2}{12} = \frac{(X/2^b)^2}{12} \tag{5.29} $$


where μ_e is the mean value of e(nΔ). (See Chapter 7 for details of statistical quantities.)


Figure 5.21 Probability density function of e(nΔ) (uniform, with height 1/δ for −δ/2 ≤ e ≤ δ/2)


Now, if we assume that the signal x(t) is random and σ_x^2 is the variance of x(nΔ), then a
measure of signal-to-noise ratio (SNR) is defined as

$$ \frac{S}{N} = 10\log_{10}\left(\frac{\text{signal power}}{\text{error power}}\right) = 10\log_{10}\left(\frac{\sigma_x^2}{\sigma_e^2}\right) \quad \text{(for zero mean)} \tag{5.30} $$
where ‘signal power’ is Average[x^2(nΔ)] and ‘error power’ or the quantization noise is
Average[e^2(nΔ)]. This describes the dynamic range (or quantization signal-to-noise ratio)
of the ADC. Since we assume that the error is random and has a uniform probability density
function, for the full use of the dynamic range of an ADC with b bit word length, e.g. σ_x = X,
Equation (5.30) becomes

$$ \frac{S}{N} = 10\log_{10}(\sigma_x^2/\sigma_e^2) = 10\log_{10}\left[12X^2/(X/2^b)^2\right] = 10\log_{10}(12 \times 2^{2b}) \approx 10.8 + 6b \ \text{dB} \tag{5.31} $$


For example, a 12 bit ADC (11 bit word length) has a maximum dynamic range of about
77 dB. However, we note that this would undoubtedly result in clipping. So, if we choose
σ_x = X/4 to ‘avoid’ clipping, then the dynamic range is reduced to

$$ \frac{S}{N} = 10\log_{10}(\sigma_x^2/\sigma_e^2) \approx 6b - 1.25 \ \text{dB} \tag{5.32} $$

In this case, a 12 bit ADC gives a dynamic range of about 65 dB. This may be reduced 
further by practical considerations of the quality of the acquisition system (Otnes and 
Enochson, 1978). For example, the sampler in Figure 5.16 cannot be realized with a 
train of delta functions (thus producing aperture error and jitter). Nevertheless, it is 
emphasized that we must avoid clipping but always try to use the maximum dynamic 
range. 
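The figures of 77 dB and 65 dB follow directly from Equations (5.29) to (5.32) and can be reproduced as below. This is a minimal sketch; the function name and the choice of normalizing the full range X to one are assumptions for the example. The last part checks the δ²/12 error variance numerically by quantizing a fine ramp that spans exactly four steps of the 10/8 volt quantizer.

```python
import math

def quantization_snr_db(bits, sigma_x_over_full_scale):
    """10*log10(sigma_x^2 / sigma_e^2) with sigma_e^2 = (X / 2**bits)^2 / 12, X = 1."""
    sigma_e_sq = (1.0 / 2 ** bits) ** 2 / 12.0
    return 10.0 * math.log10(sigma_x_over_full_scale ** 2 / sigma_e_sq)

b = 11   # word length of a 12 bit ADC, excluding the sign bit
full_use = quantization_snr_db(b, 1.0)     # sigma_x = X, Equation (5.31)
quarter = quantization_snr_db(b, 0.25)     # sigma_x = X/4, Equation (5.32)
print(f"sigma_x = X:   {full_use:.1f} dB (10.8 + 6b = {10.8 + 6 * b:.1f} dB)")
print(f"sigma_x = X/4: {quarter:.1f} dB (6b - 1.25 = {6 * b - 1.25:.1f} dB)")

# Numerical check of the error variance for a step of 10/8 V
step = 10.0 / 2 ** 3
ramp = [i * 1e-3 for i in range(5000)]                 # 0 to 4.999 V, four full steps
errors = [round(v / step) * step - v for v in ramp]
error_power = sum(e * e for e in errors) / len(errors)
print(f"measured error power {error_power:.4f}, step^2/12 = {step ** 2 / 12:.4f}")
```

The small differences between the computed values and the rounded rules arise because each bit contributes about 6.02 dB rather than exactly 6 dB.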


5.4 SOME OTHER CONSIDERATIONS IN SIGNAL ACQUISITION 

Signal Conditioning 

We have already noted that signals should use as much of the ADC range as possible — but 
without overloading — or clipping of the signal will occur. ‘Signal conditioning’ refers to the 
procedures used to ensure that ‘good data’ are delivered to the ADC. This includes the correct 
choice of transducer and its operation and subsequent manipulation of the data before the ADC. 

Specifically, transducer outputs must be ‘conditioned’ to accommodate cabling,
environmental considerations and features of the recording instrumentation. Conditioning includes
amplification and filtering, with due account taken of power supplies and cabling. For
example, some transducers, such as strain gauges, require power supplies. Considerations in this
case include: stability of power supply with little ripple, low noise, temperature stability, low
background noise pick-up, low interchannel interference, etc.

Amplifiers: Amplifiers are used to increase (or attenuate) magnitudes in a calibrated fashion; 
transform signals from one physical variable to another, e.g. charge to voltage; remove d.c. 
biases; provide impedance matching, etc. The most common types are voltage amplifier, 
charge amplifier, differential amplifier, preamplifier, etc. In each case, care should be taken 
to ensure linearity, satisfactory frequency response and satisfactory ‘slew rate’ (i.e. response 
to maximum rate of rise of a signal). In any case, the result of amplification should not cause 
‘overload’ which exceeds the limit of input (or output) range of a device. 
Filters: Filters are used to limit signal bandwidth. Typically these are low-pass filters
(anti-aliasing filters), high-pass filters, band-pass filters and band-stop filters. (Note that high-pass
and band-stop filters would need additional low-pass filtering before sampling.) Most filters 
here are ‘analogue’ electronic filters. Sometimes natural ‘mechanical filtering’ is very helpful. 

Cabling: Cabling must be suited to the application. Considerations are cable length, impedance
of cable and electronics, magnetic and capacitive background noise, environment,
interferences, transducer type, etc.

Triboelectric noise (static electricity) is generated when a coaxial cable is used to connect 
a high-impedance piezoelectric transducer to a charge amplifier, and undergoes mechanical 
distortion. Grounding must be considered. Suitable common earthing must be established 
to minimize electromagnetic interference manifesting itself as background noise. Shielding 
confines radiated electromagnetic energy. 

Note that none of the considerations listed above is ‘less important’ to obtain (and generate) 
good data. A couple of practical examples are demonstrated below. First, consider generating a 
signal using a computer to excite a shaker. The signal must pass through a digital-to-analogue 
converter (DAC), a low-pass filter (or reconstruction filter) and the power amplifier before 
being fed into the shaker. Note that, in this case, it is not only the reconstruction filter, but also 
the power amplifier that is a filter in some sense. Thus, each device may distort the original 
signal, and consequently the signal which the shaker receives may not properly represent 
the original (or intended) signal. The frequency response of the power amplifier in particular 
should be noted carefully. Most power amplifiers have a band-limited frequency response
with an upper frequency limit that is high enough for general sound and vibration
problems. However, some also have a lower frequency limit, so that the amplifier
acts as a band-pass filter. This type of power amplifier can distort the signal significantly if
the signal contains frequency components outside the frequency band of the amplifier. For
example, if a transient signal such as a half-sine pulse is fed to the power amplifier, the output
will be considerably distorted owing to the loss of energy in the low-frequency region. This
effect is shown in Figure 5.22, where a half-sine wave is generated by a computer and measured
before and after the power amplifier which has a lower frequency limit.


Figure 5.22 Example of distortion due to the power amplifier (half-sine wave → power amplifier with a lower frequency limit → distorted response)

As another practical example, consider the beam experimental setup in Chapter 1 
(Figure 1.11). In Figure 1.11, all the cables are secured adequately to minimize additional 
dynamic effects. Note that the beam is very light and flexible, so any excessive movement and 
interference of the cables can affect the dynamics of the beam. Now, suppose the cable 
connected to the accelerometer is loosely laid down on the table as shown in Figure 5.23. 
Then, the movement of the beam causes the cable to slide over the table. This results in
additional friction damping to the structure (and also possibly additional stiffness). The system
frequency response functions for each case are shown in Figure 5.24, where the effects of this 
cable interference are clearly seen. 
Figure 5.23 Experiment with cable interference



Figure 5.24 FRF of the system with/without cable interference 


Data Validation 

As demonstrated in the above experimental results, every possible effort should be made 
early in any experiment to ensure good data are captured. Data validation refers generally
to the many and varied checks and tests one may perform prior to ‘serious’ signal
processing. This will occur at both analogue and digital stages. Obviously it would be 
best always to process only ‘perfect’ signals. This ideal is impossible and a very clear 
understanding of any shortcomings in the data is vital. 

A long list of items for consideration can be compiled, some of which are as follows: 

• Most signals will be recorded, even if some real-time processing is carried out. Identify 
any physical events for correlation with data. 

• Inspect time histories critically, e.g. if periodic signals are expected, check for other 
signals such as noise, transients. 

• Ensure non-stationary signals are adequately captured and note any changing ‘physics’ 
that might account for the non-stationarity. 






• Check for signal clipping. 

• Check for adequate signal levels (dynamic range). 

• Check for excessive background noise, sustained or intermittent (spikes or bursts). 

• Check for power line pick-up. 

• Check for spurious trends, i.e. drifts, d.c. offsets. 

• Check for signal drop-outs. 

• Check for ADC operation. 

• Check for aliasing. 

• Always carry out some sample analyses (e.g. moments, spectra and probability
densities, etc.; these statistical quantities are discussed in Part II of this book).


5.5 SHANNON’S SAMPLING THEOREM (SIGNAL RECONSTRUCTION) 


This chapter concludes with a look at digital-to-analogue conversion and essentially starts
from the fact that, to avoid aliasing, the sampling rate f_s should be greater than twice the
highest frequency contained in the signal. This raises a fundamental question: is it possible to
reconstruct the original analogue signal exactly from the sample values or has the information
carried by the original analogue signal been lost? As long as there is no aliasing, we can indeed
reconstruct the signal exactly and this introduces the concept of an ideal digital-to-analogue
conversion. This is simple to understand using the following argument.

Recall the pictorial representation of the Fourier transforms of a continuous signal x(t) and
its sampled equivalent x(nΔ), i.e. X(f) and X(e^{j2πfΔ}) respectively, as shown in Figure 5.25.
The figure shows the situation when no aliasing occurs. Also, note the scale factor.


Figure 5.25 Fourier transforms: X(f) and X(e^{j2πfΔ}) (the latter plotted as Δ·X(e^{j2πfΔ}), with −f_H, f_H, ±f_s/2 and ±f_s marked on the frequency axis)


In digital-to-analogue conversion, we want to operate on x(nΔ) (equivalently X(e^{j2πfΔ}))
to recover x(t) (equivalently X(f)). It is clear that to achieve this we simply need to multiply
X(e^{j2πfΔ}) by a frequency window function H(f), where

$$ H(f) = \begin{cases} \Delta\ (= 1/f_s) & -f_s/2 < f < f_s/2 \\ 0 & \text{elsewhere} \end{cases} \tag{5.33} $$

Then

$$ X(f) = H(f)\,X(e^{j2\pi f\Delta}) \tag{5.34} $$
Taking the inverse Fourier transform of this gives

$$ x(t) = h(t) * x(n\Delta) \tag{5.35} $$

where

$$ h(t) = \frac{\sin \pi f_s t}{\pi f_s t} \tag{5.36} $$


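Note that Equation (5.35) is not a mathematically correct expression, since it convolves a
function of continuous time with a sequence. Thus, using the expression for x(nΔ) as
x_s(t) = x(t)i(t), where i(t) = \sum_{n=-\infty}^{\infty} \delta(t - n\Delta), Equation (5.35)
becomes

$$ \begin{aligned} x(t) = h(t) * x_s(t) &= \int_{-\infty}^{\infty} \frac{\sin \pi f_s \tau}{\pi f_s \tau} \sum_{n=-\infty}^{\infty} x(t-\tau)\,\delta(t - n\Delta - \tau)\,d\tau \\ &= \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\sin \pi f_s \tau}{\pi f_s \tau}\, x(t-\tau)\,\delta(t - n\Delta - \tau)\,d\tau \\ &= \sum_{n=-\infty}^{\infty} x(n\Delta)\, \frac{\sin \pi f_s (t - n\Delta)}{\pi f_s (t - n\Delta)} \end{aligned} \tag{5.37} $$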


i.e. the ‘ideal’ interpolating function is the sinc function of the form sin x/x. Equation
(5.37) can be depicted as in Figure 5.26, which shows that reconstructing x(t) at time t
requires the infinite sum of scaled sinc functions.



Figure 5.26 Graphical representation of Equation (5.37) 


Note that, with reference to Figure 5.25, if the highest frequency component of the
signal is f_H then the window function H(f) need only be Δ for |f| < f_H and zero
elsewhere. Using this condition and applying the arguments above, the reconstruction
algorithm can be expressed as

$$ x(t) = \sum_{n=-\infty}^{\infty} x(n\Delta)\, \frac{2 f_H}{f_s}\, \frac{\sin 2\pi f_H (t - n\Delta)}{2\pi f_H (t - n\Delta)} \tag{5.38} $$

This result is called Shannon’s sampling theorem.
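The interpolation formula can be tried out with a finite number of terms. The sketch below is illustrative and rests on assumptions: the function `sinc_reconstruct`, the 5 Hz test sinusoid, the 100 Hz sampling rate and the truncation to 4001 samples are all choices made for the example, and truncating the infinite sum of Equation (5.37) leaves a small residual error.

```python
import math

def sinc_reconstruct(samples, first_index, fs_hz, t):
    """Evaluate a truncated form of Equation (5.37) at time t.

    samples[i] holds x((first_index + i) * delta), with delta = 1 / fs_hz.
    """
    total = 0.0
    for i, value in enumerate(samples):
        arg = math.pi * (fs_hz * t - (first_index + i))    # pi * fs * (t - n*delta)
        total += value if abs(arg) < 1e-12 else value * math.sin(arg) / arg
    return total

fs, p = 100.0, 5.0                  # sampling rate and signal frequency in Hz
n_min, n_max = -2000, 2000          # finite range replacing the infinite sum
samples = [math.sin(2 * math.pi * p * n / fs) for n in range(n_min, n_max + 1)]

t_between = 0.0123                  # a time between two sample instants
estimate = sinc_reconstruct(samples, n_min, fs, t_between)
exact = math.sin(2 * math.pi * p * t_between)
print(f"reconstructed {estimate:.5f}, exact {exact:.5f}")

t_on_grid = 7 / fs                  # a sample instant returns the sample itself
on_grid = sinc_reconstruct(samples, n_min, fs, t_on_grid)
print(f"at a sample instant: {on_grid:.6f}")
```

The slow decay of the sinc function is why many terms are needed for a close match, which is one reason the ideal reconstruction is not used directly in practice.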


This ideal reconstruction algorithm is not fully realizable owing to the infinite summation,
and practical digital-to-analogue converters (DACs) are much simpler, notably the zero-order
hold converter. Typical digital-to-analogue conversion using the zero-order hold is shown in
Figure 5.27. The zero-order hold DAC generates a sequence of rectangular pulses by holding
each sample for Δ seconds. The output of the zero-order hold DAC, however, inevitably
contains a large amount of unwanted high-frequency components. Thus, in general, we need
a low-pass filter to eliminate these high frequencies following the DAC. This low-pass filter
is often called the reconstruction filter (or anti-imaging filter), and has a similar (or identical)
design to the anti-aliasing low-pass filter. The cut-off frequency of the reconstruction filter is
usually set to half the sampling rate, i.e. f_s/2.

Figure 5.27 Reconstruction of a signal using a zero-order hold DAC (x(nΔ) → zero-order hold DAC → x̃(t) → low-pass filter → x̂(t) ≈ x(t))

Note that not only does the zero-order hold DAC produce undesirable high frequencies,
but also its frequency response is no longer flat in both magnitude and phase (it has the shape
of a sinc function). Thus the reconstructed output signal has reduced amplitude and phase
change in its passband (the frequency band of the original (or desired) signal x(t)). To
compensate for this effect, a pre-equalization digital filter (before the DAC) or post-equalization
analogue filter (after the reconstruction filter) is often used. Another method of reducing this
effect is by ‘increasing the update rate’ of the DAC. Similar to the sampling rate, the update
rate is the rate at which the DAC updates its value.

For example, if we can generate a sequence x(nΔ) in Figure 5.27 such that 1/Δ is much
higher than f_H (the highest frequency of the desired signal x(t)), and if the DAC is capable
of generating the signal accordingly, then we have a much smoother analogue signal x̃(t), i.e.
x̃(t) ≈ x(t). In this case, we may not need to use the reconstruction filter. In effect, for a given
band-limited signal, by representing x(t) using much narrower rectangular pulses, we have
the frequency response of the DAC with flatter passband and negligible high-frequency side
roll-off of the sinc function (note that the rectangular pulse (or a sinc function in the frequency
domain) can be considered as a crude low-pass filter). Since many modern DAC devices have
an update rate of 1 MHz or above, in many situations in sound and vibration applications, we
may reasonably approximate the desired signal simply by using the maximum capability of
the DAC device.
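The flattening of the passband with increasing update rate can be quantified from the sinc-shaped magnitude response. The sketch below assumes the normalized zero-order hold magnitude |sin(πfΔ)/(πfΔ)|, with Δ the reciprocal of the update rate; the function names and the 1 kHz signal frequency are illustrative assumptions, and phase is not considered.

```python
import math

def zoh_gain(freq_hz, update_rate_hz):
    """Normalized zero-order hold magnitude |sin(pi*f*D)/(pi*f*D)|, D = 1/update rate."""
    x = math.pi * freq_hz / update_rate_hz
    return 1.0 if x == 0 else abs(math.sin(x) / x)

def zoh_droop_db(freq_hz, update_rate_hz):
    """Amplitude reduction in dB at freq_hz due to the zero-order hold."""
    return 20.0 * math.log10(zoh_gain(freq_hz, update_rate_hz))

f_high = 1000.0   # illustrative highest frequency of the desired signal, in Hz
for rate in (2000.0, 10000.0, 1.0e6):
    print(f"update rate {rate:9.0f} Hz: droop at {f_high:.0f} Hz = "
          f"{zoh_droop_db(f_high, rate):8.4f} dB")
```

At an update rate of only twice f_H the amplitude is down by nearly 4 dB at f_H, whereas at 1 MHz the reduction is negligible for this signal.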


5.6 BRIEF SUMMARY 


1. The Fourier transform pair for a sampled sequence is given by

$$ x(n\Delta) = \Delta \int_{-1/2\Delta}^{1/2\Delta} X(e^{j2\pi f\Delta})\, e^{j2\pi f n\Delta}\, df \quad \text{and} \quad X(e^{j2\pi f\Delta}) = \sum_{n=-\infty}^{\infty} x(n\Delta)\, e^{-j2\pi f n\Delta} $$

In this case, the scaling factor Δ is introduced.
2. The relationship between the Fourier transform of a continuous signal and the Fourier
transform of the corresponding sampled sequence is

$$ X(e^{j2\pi f\Delta}) = \frac{1}{\Delta} \sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{\Delta}\right) $$

i.e. X(e^{j2πfΔ}) is a continuous function consisting of replicas of scaled X(f), and is
periodic with period 1/Δ. This introduces possible aliasing.

3. To avoid aliasing, an ‘analogue’ low-pass filter (anti-aliasing filter) must be used before
the analogue-to-digital conversion, and the sampling rate of the ADC must be high
enough. In practice, for a given anti-aliasing filter with a roll-off rate of B dB/octave
and an ADC with a dynamic range of A dB, the sampling rate is chosen as

$$ f_s \approx 2 \times 10^{0.3A/B} f_c $$

4. To obtain ‘good’ data, we need to use the maximum dynamic range of the ADC (but 
must avoid clipping). Also, care must be taken with any signal conditioning, filters, 
amplifiers, cabling, etc. 

5. When generating an analogue signal, for some applications, we may not need a 
reconstruction filter if the update rate of the DAC is high. 


5.7 MATLAB EXAMPLES 


Example 5.1: Demonstration of aliasing 

Case A: This example demonstrates that the values ±p + k/Δ Hz become aliases of
frequency p Hz (see Figure 5.11).

Consider that we want to sample a sinusoidal signal x(t) = sin 2πpt with the
sampling rate f_s = 100 Hz. We examine three cases: x_1(t) = sin 2πp_1 t, x_2(t) = sin 2πp_2 t
and x_3(t) = sin 2πp_3 t where p_1 = 20 Hz, p_2 = 80 Hz and p_3 = 120 Hz. Note that all
the frequencies will appear at the same frequency of 20 Hz.


MATLAB code (the comments are given as % lines):

% Define the sampling rate fs = 100 Hz, total record time T = 10 seconds, and
% the time variable t from 0 to 'T-1/fs' seconds. Also define the frequencies
% for each sinusoid.
clear all
fs=100; T=10;
t=0:1/fs:T-1/fs;
p1=20; p2=80; p3=120;

% Generate the signals x1(t), x2(t) and x3(t). Note that all these signals use
% the same time variable 't', thus they have the same sampling rate.
x1=sin(2*pi*p1*t);
x2=sin(2*pi*p2*t);
x3=sin(2*pi*p3*t);

% Perform the DFT of each signal, and calculate the frequency variable f.
N=length(t);
X1=fft(x1); X2=fft(x2); X3=fft(x3);
f=fs*(0:N-1)/N;

% Plot the modulus of the DFT of x1(t) = sin 2*pi*(20)*t for the frequency
% range 0 Hz to 100 Hz (i.e. up to the sampling frequency). Note that the right
% half of the graph is the mirror image of the left half (except the 0 Hz
% component).
figure(1); plot(f, abs(X1)/fs/T)
xlabel('Frequency (Hz)'); ylabel('Modulus')
axis([0 100 0 0.55])

% Plot the modulus of the DFT of x2(t) = sin 2*pi*(80)*t.
figure(2); plot(f, abs(X2)/fs/T)
xlabel('Frequency (Hz)'); ylabel('Modulus')
axis([0 100 0 0.55])

% Plot the modulus of the DFT of x3(t) = sin 2*pi*(120)*t.
figure(3); plot(f, abs(X3)/fs/T)
xlabel('Frequency (Hz)'); ylabel('Modulus')
axis([0 100 0 0.55])


Results 


[Figure: modulus of the DFT against frequency (Hz) from 0 to 100 Hz, showing p_1 = 20 Hz and the aliases of p_2 = 80 Hz and p_3 = 120 Hz; f_s/2 and f_s are marked]

Comments: Note that all the frequencies p_1 = 20 Hz, p_2 = 80 Hz and p_3 = 120 Hz
appear at the same frequency 20 Hz.


Example 5.2: Demonstration of aliasing 

Case B: This example demonstrates the aliasing problem on the ‘digital’ sampling of a
sampled sequence x(nΔ).

Consider a sampled sinusoidal sequence x(nΔ) = sin 2πpnΔ where p = 40 Hz,
and the sampling rate is f_s = 500 Hz (f_s = 1/Δ). Now, sample this sequence digitally
again, i.e. generate a new sequence x_1(kΔ) = x[(5k)Δ], k = 0, 1, 2, ..., by taking every
fifth sample value of x(nΔ) (this has the effect of reducing the sampling rate to 100 Hz).
Also generate a sequence x_2(kΔ) = x[(10k)Δ] by taking every tenth sample value of
x(nΔ), which reduces the sampling rate to 50 Hz. In this second case aliasing occurs,
i.e. p = 40 Hz will appear at 10 Hz.
MATLAB code (the comments are given as % lines):

% Define the sampling rate fs = 500 Hz, total record time T = 10 seconds, and
% the time variable t from 0 to 'T-1/fs' seconds. Also generate the sampled
% sinusoidal signal whose frequency is 40 Hz.
clear all
fs=500; T=10;
t=0:1/fs:T-1/fs;
p=40; x=sin(2*pi*p*t);

% Perform digital sampling, i.e. generate new sequences x1(k*Delta) and
% x2(k*Delta) as described above.
x1=x(1:5:end);
x2=x(1:10:end);

% Perform the DFT of each signal x(n*Delta), x1(k*Delta) and x2(k*Delta), and
% calculate the frequency variables f, f1 and f2 accordingly.
N=length(x); N1=length(x1); N2=length(x2);
X=fft(x); X1=fft(x1); X2=fft(x2);
f=fs*(0:N-1)/N; f1=100*(0:N1-1)/N1; f2=50*(0:N2-1)/N2;

% Plot the modulus of the DFT of x(n*Delta) = sin 2*pi*(40)*n*Delta for the
% frequency range 0 Hz to 500 Hz (up to the sampling rate).
figure(1); plot(f, abs(X)/fs/T)
xlabel('Frequency (Hz)'); ylabel('Modulus')
axis([0 500 0 0.55])

% Plot the modulus of the DFT of x1(k*Delta) for the frequency range 0 Hz to
% 100 Hz (sampling rate of x1(k*Delta)).
figure(2); plot(f1, abs(X1)/100/T)
xlabel('Frequency (Hz)'); ylabel('Modulus')
axis([0 100 0 0.55])

% Plot the modulus of the DFT of x2(k*Delta) for the frequency range 0 Hz to
% 50 Hz (sampling rate of x2(k*Delta)).
figure(3); plot(f2, abs(X2)/50/T)
xlabel('Frequency (Hz)'); ylabel('Modulus')
axis([0 50 0 0.55])



Results

(a) DFT of $x(n\Delta) = \sin 2\pi(40)n\Delta$ with $f_s\,(= 1/\Delta) = 500$ Hz
(b) DFT of $x_1(k\Delta) = x[(5k)\Delta]$
(c) DFT of $x_2(k\Delta) = x[(10k)\Delta]$
(Each panel shows the modulus against frequency in Hz.)

Comments: Note that aliasing occurs in the third case, i.e. $p = 40$ Hz appears at 10 Hz
because the sampling rate is 50 Hz in this case.


Example 5.3: Demonstration of ‘digital’ anti-aliasing filtering 

This example demonstrates a method to overcome the problem addressed in the previous 
MATLAB example. 

We use the MATLAB function ‘resample’ to avoid the aliasing problem. The 
‘resample’ function applies the digital anti-aliasing filter to the sequence before the 
sampling. 

Consider a sampled sinusoidal sequence $x(n\Delta) = \sin 2\pi p_1 n\Delta + \sin 2\pi p_2 n\Delta$
where $p_1 = 10$ Hz and $p_2 = 40$ Hz and the sampling rate is $f_s = 500$ Hz ($f_s = 1/\Delta$).
Generate new sequences $x_1(k\Delta_1)$ and $x_2(k\Delta_2)$ from $x(n\Delta)$ such that $\Delta_1/\Delta = 5$ and
$\Delta_2/\Delta = 10$ without causing aliasing using the ‘resample’ function.


Line  MATLAB code                           Comments

1     clear all                             Define the sampling rate fs = 500 Hz, total
2     fs=500; T=10;                         record time T = 10 seconds, and the time
3     t=0:1/fs:T-1/fs; p1=10; p2=40;        variable t from 0 to 'T-1/fs' seconds. Also
4     x=sin(2*pi*p1*t) + sin(2*pi*p2*t);    generate the sampled signal whose frequency
                                            components are 10 Hz and 40 Hz.

5     x1=resample(x,100,500);               Perform the ‘resampling’ as described above.
6     x2=resample(x,50,500);                For example, the function ‘resample(x,100,500)’
                                            takes the sequence ‘x’, applies a low-pass filter
                                            appropriately to the sequence, and returns the
                                            resampled sequence, where ‘100’ is the new
                                            sampling rate and ‘500’ is the original sampling
                                            rate.

7     N=length(x); N1=length(x1);           Exactly the same code as in the previous
      N2=length(x2);                        example.
8     X=fft(x); X1=fft(x1); X2=fft(x2);
9     f=fs*(0:N-1)/N; f1=100*(0:N1-1)/N1;
      f2=50*(0:N2-1)/N2;

10    figure(1); plot(f, abs(X)/fs/T)       Exactly the same code as in the previous
11    xlabel('Frequency (Hz)');             example.
      ylabel('Modulus')
12    axis([0 500 0 0.55])
13    figure(2); plot(f1, abs(X1)/100/T)
14    xlabel('Frequency (Hz)');
      ylabel('Modulus')
15    axis([0 100 0 0.55])

16    figure(3); plot(f2, abs(X2)/50/T)     Note that, due to the low-pass filtering, the
17    xlabel('Frequency (Hz)');             40 Hz component disappears on this graph.
      ylabel('Modulus')
18    axis([0 50 0 0.55])

Results

(a) DFT of $x(n\Delta) = \sin 2\pi(10)n\Delta + \sin 2\pi(40)n\Delta$ with $f_s\,(= 1/\Delta) = 500$ Hz
(b) DFT of $x_1(k\Delta_1)$ (using digital anti-aliasing filter), where $\Delta_1 = 5\Delta$ (i.e. $f_s = 100$ Hz)
(c) DFT of $x_2(k\Delta_2)$ (using digital anti-aliasing filter), where $\Delta_2 = 10\Delta$ (i.e. $f_s = 50$ Hz)


Comments: Note that, in Figure (c), only the 10 Hz component is shown, and the 40 Hz 
component disappears owing to the inherent low-pass (anti-aliasing) filtering process in 
the ‘resample’ function. 


6 

The Discrete Fourier Transform 


Introduction 

In this chapter we develop the properties of a fundamental tool of digital signal analysis — 
the discrete Fourier transform (DFT). This will include aspects of linear filtering, and 
relating the DFT to other Fourier representations. The chapter concludes with an intro- 
duction to the fast Fourier transform (FFT). 

6.1 SEQUENCES AND LINEAR FILTERS 
Sequences 

A sequence (or digital signal) is a function which is defined at a discrete set of points.
A sequence results from: (i) a process which is naturally discrete such as a daily posted
currency exchange rate, and (ii) sampling (at $\Delta$ second intervals, say) an analogue signal
as in Chapter 5. We shall denote a sequence as $x(n)$. This is an ordered set of numbers
as shown in Figure 6.1.


Figure 6.1 Example of a sequence

Some examples are listed below:

(a) The unit impulse sequence or the Kronecker delta function, $\delta(n)$, is defined as

$$\delta(n) = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{if } n \neq 0 \end{cases} \tag{6.1}$$

It can be depicted as in Figure 6.2.


Figure 6.2 The unit impulse sequence, $\delta(n)$


This is the digital impulse or unit sample, i.e. it is the digital equivalent of the Dirac
delta $\delta(t)$. If the unit impulse sequence is delayed (or shifted) by $k$, then

$$\delta(n-k) = \begin{cases} 1 & \text{if } n = k \\ 0 & \text{if } n \neq k \end{cases} \tag{6.2}$$

If $k$ is positive the shift is $k$ steps to the right. For example, Figure 6.3 shows the case for
$k = 2$.


Figure 6.3 The delayed unit impulse sequence, $\delta(n-2)$


(b) The unit step sequence, $u(n)$, is defined as

$$u(n) = \begin{cases} 1 & \text{if } n \ge 0 \\ 0 & \text{if } n < 0 \end{cases} \tag{6.3}$$

The unit sample can be expressed by the difference of the unit step sequences, i.e. $\delta(n) =
u(n) - u(n-1)$. Conversely, the unit step can be expressed by the running sum of the unit
sample, i.e. $u(n) = \sum_{k=-\infty}^{n} \delta(k)$.
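The relationship between the unit sample and the unit step can be checked numerically. The following is a minimal sketch, assuming a short index range n = -5, ..., 5 chosen only for illustration; the variable names are not from the text.

```matlab
% Index axis n = -5..5 (an arbitrary illustrative range)
n = -5:5;
delta = double(n == 0);        % unit impulse sequence, Equation (6.1)
u     = double(n >= 0);        % unit step sequence, Equation (6.3)

% Running sum of the unit sample; all terms before n = -5 are zero,
% so starting the sum at n = -5 is equivalent to starting at -infinity
u_from_delta = cumsum(delta);

% Difference u(n) - u(n-1); u(n-1) at the first index is taken as 0
delta_from_u = u - [0, u(1:end-1)];

disp(isequal(u_from_delta, u))
disp(isequal(delta_from_u, delta))
```

Both comparisons should hold exactly, since only zeros and ones are involved.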


Starting with the unit sample, an arbitrary sequence can be expressed as the sum of scaled,
delayed unit impulses. For example, consider the sequence $x(n)$ shown in Figure 6.4, where
the values of the sequence are denoted as $a_n$.

This sequence can be written as $x(n) = a_{-3}\delta(n+3) + a_1\delta(n-1) + a_2\delta(n-2) +
a_5\delta(n-5)$, i.e. in general form any sequence can be represented as

$$x(n) = \sum_{k=-\infty}^{\infty} x(k)\delta(n-k) \tag{6.4}$$

Figure 6.4 An arbitrary sequence, $x(n)$


Linear Filters 


Discrete Linear Time (Shift) Invariant Systems$^{\text{M6.1}}$


The input-output relationship for a discrete LTI system (a digital filter) is shown in 
Figure 6.5. 


Figure 6.5 A discrete LTI system (input sequence $x(n)$, impulse response $h(n)$, output sequence $y(n)$)


Similar to the continuous LTI system, we define the impulse response sequence of
the discrete LTI system as $h(n)$. If the input to the system is a scaled and delayed impulse
at $k$, i.e. $x(n) = a_k\delta(n-k)$, then the response of the system at $n$ is $y(n) = a_k h(n-k)$.
So, for a general input sequence, the response at $n$ due to input $x(k)$ is $h(n-k)x(k)$. Since
any input can be expressed as the sum of scaled, delayed unit impulses as described in
Equation (6.4), the total response $y(n)$ to the input sequence $x(n)$ is

$$y(n) = \sum_{k=-\infty}^{n} h(n-k)x(k) \quad \text{if the system is causal} \tag{6.5a}$$

or

$$y(n) = \sum_{k=-\infty}^{\infty} h(n-k)x(k) \quad \text{if the system is non-causal} \tag{6.5b}$$

We shall use the latter notation (6.5b) which includes the former (6.5a) as a special
case when $h(n) = 0$ for $n < 0$. This expression is called the convolution sum, which
describes the relationship between the input and the output. That is, the input-output
relationship of the digital LTI system is expressed by the convolution of two sequences
$x(n)$ and $h(n)$:

$$y(n) = x(n) * h(n) = \sum_{k=-\infty}^{\infty} h(n-k)x(k) \tag{6.6}$$

Note that the convolution sum satisfies the property of commutativity, i.e.

$$y(n) = \sum_{k=-\infty}^{\infty} h(n-k)x(k) = \sum_{r=-\infty}^{\infty} h(r)x(n-r) \tag{6.7a}$$

or simply

$$y(n) = x(n) * h(n) = h(n) * x(n) \tag{6.7b}$$

The above expressions for the convolution sum are analogous to the convolution integral
for a continuous system (see Equations (4.51), (4.53)). An example of the convolution sum
is demonstrated graphically in Figure 6.6. In this figure, note that the number of non-zero
elements of sequence $y(n)$ is ‘12’, which is one less than the sum of the numbers of
non-zero elements of sequences $x(n)$ and $h(n)$.
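The length rule and the commutativity of the convolution sum can be verified with a few lines of MATLAB. This is a sketch only: the sample values are hypothetical, and the lengths 5 and 8 are chosen so that the output has 12 elements as in the figure.

```matlab
x = [1 2 3 2 1];               % 5 non-zero samples (hypothetical values)
h = 0.8.^(0:7);                % 8 non-zero samples (hypothetical values)

y_builtin = conv(x, h);        % convolution sum, Equation (6.6)

% The same sum written out explicitly: y(n) = sum_k h(n-k) x(k)
Ny = length(x) + length(h) - 1;
y_direct = zeros(1, Ny);
for n = 0:Ny-1
    for k = 0:length(x)-1
        if (n-k) >= 0 && (n-k) <= length(h)-1
            y_direct(n+1) = y_direct(n+1) + h(n-k+1)*x(k+1);
        end
    end
end

len_y    = length(y_builtin);                    % 5 + 8 - 1 = 12
err_sum  = max(abs(y_builtin - y_direct));       % explicit sum vs conv
err_comm = max(abs(conv(h, x) - y_builtin));     % commutativity, Equation (6.7b)
disp([len_y, err_sum, err_comm])
```

The two error values should be at round-off level.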


Figure 6.6 Illustrations of a convolution sum$^{\text{M6.1}}$, $y(n) = x(n) * h(n) = \sum_{k} h(n-k)x(k)$ (not to scale)


Relationship to Continuous Systems 

Starting from $y(t) = h(t) * x(t) = \int_{-\infty}^{\infty} h(\tau)x(t-\tau)\,d\tau$, consider that the signals are
sampled such that $y(n\Delta) = h(n\Delta) * x(n\Delta)$. Then the approximation to the convolution integral
becomes

$$y(n\Delta) \approx \sum_{r=-\infty}^{\infty} h(r\Delta)\,x((n-r)\Delta)\cdot\Delta \tag{6.8}$$

Note the scaling factor $\Delta$, i.e. if the discrete LTI system $h(n)$ results from the sampling of the
corresponding continuous system $h(t)$ with sampling rate $1/\Delta$ and the input $x(n)$ is also the
sampled version of $x(t)$, then it follows that

$$y(n\Delta) \approx y(n)\cdot\Delta \tag{6.9}$$

where $y(n) = h(n) * x(n)$.

The concept of creating a digital filter $h(n)$ by simply sampling the impulse response of
an analogue filter $h(t)$ is called ‘impulse-invariant’ filter design (see Figure 5.6 in Section 5.1).
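The role of the scaling factor in Equation (6.9) can be illustrated numerically. The sketch below assumes, purely for illustration, a first-order system h(t) = exp(-t) for t >= 0 driven by a unit step, for which the convolution integral is 1 - exp(-t); the sampling interval D is also an assumed value.

```matlab
D = 1e-3;                      % sampling interval (assumed)
t = 0:D:5;                     % time axis
h = exp(-t);                   % samples of h(t) = exp(-t), t >= 0 (assumed system)
x = ones(size(t));             % samples of a unit step input (assumed input)

y_disc = conv(h, x);           % y(n) = h(n) * x(n), without any scaling
y_disc = y_disc(1:length(t));  % keep the part covered by the time axis
y_approx = y_disc * D;         % Equation (6.9): y(n*D) is approximately y(n)*D

y_exact = 1 - exp(-t);         % analytical convolution integral for this pair
err = max(abs(y_approx - y_exact));
disp(err)
```

Without the factor D the discrete result is larger by a factor of about 1/D; with it, the error is of the order of D for this example.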
Stability and Description of a Digital LTI System

Many digital systems are characterized by difference equations (analogous to differential
equations used for continuous systems). The input-output relationship for a digital system
(Figure 6.5) can be expressed by

$$y(n) = -\sum_{k=1}^{N} a_k\,y(n-k) + \sum_{r=0}^{M} b_r\,x(n-r) \tag{6.10}$$

Taking the z-transform of Equation (6.10) gives

$$Z\{y(n)\} = Y(z) = -Y(z)\sum_{k=1}^{N} a_k z^{-k} + X(z)\sum_{r=0}^{M} b_r z^{-r} \tag{6.11}$$

Note that we use the time shifting property of the z-transform, i.e. $Z\{x(n-r)\} = z^{-r}X(z)$, to
obtain Equation (6.11). Rearranging Equation (6.11) gives the transfer function of the digital
system as

$$H(z) = \frac{Y(z)}{X(z)} = \frac{\sum_{r=0}^{M} b_r z^{-r}}{1 + \sum_{k=1}^{N} a_k z^{-k}} \tag{6.12}$$

which is the z-transform of impulse response $h(n)$. Since Equation (6.12) is a rational function,
i.e. the ratio of two polynomials, it can be written as

$$H(z) = b_0 z^{N-M}\,\frac{(z-z_1)(z-z_2)\cdots(z-z_M)}{(z-p_1)(z-p_2)\cdots(z-p_N)} \tag{6.13}$$

Note that $H(z)$ has $M$ zeros (roots of the numerator) and $N$ poles (roots of the denominator).
From Equation (6.13), the zeros and poles characterize the system. A causal system is BIBO
(Bounded Input/Bounded Output) stable if all its poles lie within the unit circle $|z| = 1$. Or
equivalently, the digital LTI system is BIBO stable if $\sum_{n=-\infty}^{\infty} |h(n)| < \infty$, i.e. output sequence
$y(n)$ is bounded for every bounded input sequence $x(n)$ (Oppenheim et al., 1997).
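The pole-based stability test can be sketched in MATLAB as follows. The coefficients below describe a hypothetical second-order ARMA system and are not taken from the text; the vectors `a` and `b` follow the convention of the `filter` function, which matches Equation (6.10) with a leading denominator coefficient of 1.

```matlab
% Hypothetical system in the form of Equation (6.10):
% y(n) = -a1*y(n-1) - a2*y(n-2) + b0*x(n) + b1*x(n-1)
a = [1, -1.2, 0.72];           % denominator 1 + a1*z^-1 + a2*z^-2
b = [1, 0.5];                  % numerator b0 + b1*z^-1

p = roots(a);                  % poles of H(z)
z = roots(b);                  % zero of H(z)
is_stable = all(abs(p) < 1);   % causal system: all poles inside |z| = 1

% Impulse response obtained by running the difference equation
imp = [1, zeros(1, 199)];
h = filter(b, a, imp);
abs_sum = sum(abs(h));         % finite for a stable system

disp(abs(p).')
disp(is_stable)
disp(abs_sum)
```

For these coefficients the poles are a complex conjugate pair with modulus sqrt(0.72), so the impulse response decays and its absolute sum remains finite.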

The system described in the form of Equation (6.10) or (6.12) is called an auto-regressive 
moving average (ARMA) system (or model) which is characterized by an output that depends 
on past and current inputs and past outputs. The numbers N, M are the orders of the auto- 
regressive and moving average components, and characterize the order with the notation (N, 
M). This ARMA model is widely used for general filter design problems (e.g. Rabiner and 
Gold, 1975; Proakis and Manolakis, 1988) and for ‘parametric’ spectral estimation (Marple, 
1987). 

If all the coefficients of the denominator are zero, i.e. $a_k = 0$ for all $k$, the system is called
a moving average (MA) system, and has only zeros (except the stack of trivial poles at the
origin, $z = 0$). Note that this system is always stable since its only poles are the trivial ones
at the origin. MA systems always have a finite duration impulse response. If all the coefficients
of the numerator are zero except $b_0$, i.e. $b_r = 0$ for $r > 0$, the system is called an
auto-regressive (AR) system, and has only poles (except the stack of trivial zeros at the origin,
$z = 0$). The AR systems have a feedback nature and generally have an infinite duration impulse
response. In general, ARMA systems also have an infinite duration impulse response.

Sometimes, the ARMA representation of the system can be very useful, especially for 
real-time processing. For example, if the estimated impulse response sequence h(n) based 
on the methods in Chapter 9, which can be considered as an MA system, is very large, one 
can fit the corresponding FRF data to a reduced order ARMA model. This may be useful for 
some real-time digital signal processing (DSP). (See Comments 2 in MATLAB Example 9.4, 
Chapter 9.) 


6.2 FREQUENCY DOMAIN REPRESENTATION OF DISCRETE 
SYSTEMS AND SIGNALS 

Consider the response of a digital filter to a harmonic signal, i.e. $x(n) = e^{j2\pi fn}$. Then the
output is

$$y(n) = \sum_{k=-\infty}^{\infty} h(k)x(n-k) = \sum_{k=-\infty}^{\infty} h(k)e^{j2\pi f(n-k)} = e^{j2\pi fn}\sum_{k=-\infty}^{\infty} h(k)e^{-j2\pi fk} \tag{6.14}$$

We define $H(e^{j2\pi f}) = \sum_{k=-\infty}^{\infty} h(k)e^{-j2\pi fk}$. Then

$$y(n) = e^{j2\pi fn}H(e^{j2\pi f}) = x(n)H(e^{j2\pi f}) \tag{6.15}$$

$H(e^{j2\pi f})$ is called the frequency response function (FRF) of the system (compare this with
Equation (4.57)).


Consider an example. Suppose we have a discrete system whose impulse response
is $h(n) = a^n u(n)$, $|a| < 1$, as shown for example in Figure 6.7(a). Then the FRF of the
system is

$$H(e^{j2\pi f}) = \sum_{n=0}^{\infty} a^n e^{-j2\pi fn} = \sum_{n=0}^{\infty} \left(ae^{-j2\pi f}\right)^n \tag{6.16}$$

This is a geometric series, and using the property of a geometric series, i.e.

$$\sum_{n=0}^{\infty} r^n = \frac{1}{1-r}, \quad |r| < 1$$

Equation (6.16) can be written as

$$H(e^{j2\pi f}) = \frac{1}{1 - ae^{-j2\pi f}} \tag{6.17}$$

The modulus and phase of Equation (6.17) are shown in Figures 6.7(b) and (c), respectively.



Figure 6.7 Example of discrete impulse response and corresponding FRF 
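The closed form in Equation (6.17) can be compared with a truncated version of the sum in Equation (6.16). The following is a sketch assuming a = 0.8 and a 200-term truncation (both illustrative choices); it also evaluates the FRF at f + 1 to show that a shift of the frequency by an integer leaves it unchanged.

```matlab
a = 0.8;                       % assumed value, |a| < 1
f = -0.5:0.01:0.5;             % normalized frequency (cycles/sample)

H_closed = 1 ./ (1 - a*exp(-1j*2*pi*f));      % Equation (6.17)

% Truncated version of the infinite sum in Equation (6.16)
n = (0:199).';                 % 200 terms; a^200 is negligible
H_sum = (a.^n).' * exp(-1j*2*pi*n*f);         % (1 x 200) times (200 x length(f))

err = max(abs(H_sum - H_closed));

% The same FRF evaluated at f + 1 (an integer shift in frequency)
H_shift = 1 ./ (1 - a*exp(-1j*2*pi*(f + 1)));
err_shift = max(abs(H_shift - H_closed));
disp([err, err_shift])
```

Plotting `abs(H_closed)` and `angle(H_closed)` against `f` gives the modulus and phase curves for the chosen value of `a`.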


Note that, unlike the FRF of a continuous system, $H(e^{j2\pi f})$ is periodic (with period
1, or $2\pi$ if $\omega$ is used instead of $f$), i.e.

$$H(e^{j2\pi f}) = H(e^{j2\pi(f+k)}) = H(e^{j2\pi f}e^{j2\pi k}) = H(e^{j2\pi f}) \tag{6.18}$$

where $k$ is integer. Note also that this is a periodic continuous function, whereas its
corresponding impulse response $h(n)$ is discrete in nature. Why should the FRF be periodic? The
answer is that the system input is $x(n) = e^{j2\pi fn}$ which is indistinguishable from
$x(n) = e^{j(2\pi f + 2\pi k)n}$ and so the system reacts in the same way to both inputs. This phenomenon
is very similar to the case of sampled sequences discussed in Chapter 5, and we shall discuss
their relation shortly.

Since $H(e^{j2\pi f})$ is periodic it has a ‘Fourier series’ representation. From Equation (6.14),
we already have

$$H(e^{j2\pi f}) = \sum_{n=-\infty}^{\infty} h(n)e^{-j2\pi fn} \tag{6.19}$$

The values $h(n)$ are the Fourier coefficients and this expression can be inverted to give

$$h(n) = \int_{-1/2}^{1/2} H(e^{j2\pi f})e^{j2\pi fn}\,df \tag{6.20}$$

Equation (6.20) is easily justified by considering the Fourier series pair given in Chapter 3,
i.e.

$$x(t) = \sum_{n=-\infty}^{\infty} c_n e^{j2\pi nt/T_P}, \qquad c_n = \frac{1}{T_P}\int_{0}^{T_P} x(t)e^{-j2\pi nt/T_P}\,dt$$

The two expressions (6.19) and (6.20) are the basis of the Fourier representation of
discrete signals and apply to any sequence provided that Equation (6.19) converges. Equation
(6.19) is the Fourier transform of a sequence, and is often called the discrete-time Fourier
transform (Oppenheim et al., 1997). However, this should not be confused with the discrete
Fourier transform (DFT) for finite length signals that will be discussed in the next section.
Alternatives to Equations (6.19) and (6.20) are

$$H(e^{j\omega}) = \sum_{n=-\infty}^{\infty} h(n)e^{-j\omega n} \tag{6.21}$$

$$h(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} H(e^{j\omega})e^{j\omega n}\,d\omega \tag{6.22}$$

Note that, similar to the Fourier integral, if $h(n)$ is real, $|H(e^{j2\pi f})|$ is an even and
$\arg H(e^{j2\pi f})$ is an odd function of $f$.


The Fourier Transform of the Convolution of Two Sequences 


Let us consider an output sequence of a discrete LTI system, which is the convolution of two
sequences $h(n)$ and $x(n)$, i.e. $y(n) = h(n) * x(n) = \sum_{k=-\infty}^{\infty} h(k)x(n-k)$. Since the sequence
$x(n)$ has a Fourier representation, i.e. $x(n) = \int_{-1/2}^{1/2} X(e^{j2\pi f})e^{j2\pi fn}\,df$, substituting this into
the convolution expression gives

$$y(n) = \sum_{k=-\infty}^{\infty} h(k)x(n-k) = \sum_{k=-\infty}^{\infty} h(k)\int_{-1/2}^{1/2} X(e^{j2\pi f})e^{j2\pi f(n-k)}\,df$$

$$= \int_{-1/2}^{1/2} X(e^{j2\pi f})\left[\sum_{k=-\infty}^{\infty} h(k)e^{-j2\pi fk}\right]e^{j2\pi fn}\,df = \int_{-1/2}^{1/2} X(e^{j2\pi f})H(e^{j2\pi f})e^{j2\pi fn}\,df \tag{6.23}$$

Thus,

$$Y(e^{j2\pi f}) = X(e^{j2\pi f})H(e^{j2\pi f}) \tag{6.24}$$

i.e. the Fourier transform of the convolution of two sequences is the product of their transforms.
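For finite sequences the product relationship can be checked by evaluating the Fourier transforms directly at a few frequencies. In this sketch the sequences, the test frequencies and the helper `dtft` are all hypothetical and introduced only for illustration; each sequence is taken to start at n = 0.

```matlab
x = [1 -2 3 0.5];              % hypothetical finite sequences, starting at n = 0
h = [0.5 1 0.25];
y = conv(h, x);                % y(n) = h(n) * x(n)

% Fourier transform of a finite sequence s(n), n = 0..length(s)-1, at frequency f
dtft = @(s, f) sum(s .* exp(-1j*2*pi*f*(0:length(s)-1)));

f0 = [0.05 0.2 0.37];          % a few arbitrary normalized frequencies
err = zeros(size(f0));
for m = 1:length(f0)
    err(m) = abs(dtft(y, f0(m)) - dtft(x, f0(m))*dtft(h, f0(m)));
end
disp(max(err))
```

The difference between the transform of the convolution and the product of the transforms should be at round-off level for every frequency tried.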
Relation to Sampled Sequences, $x(n\Delta)$

If time is involved, i.e. a sequence results from sampling a continuous signal, then Equations
(6.19) and (6.20) must be modified appropriately. For a sampled sequence $x(n\Delta)$, the Fourier
representations are

$$X(e^{j2\pi f\Delta}) = \sum_{n=-\infty}^{\infty} x(n\Delta)e^{-j2\pi fn\Delta} \tag{6.25}$$

$$x(n\Delta) = \Delta\int_{-1/2\Delta}^{1/2\Delta} X(e^{j2\pi f\Delta})e^{j2\pi fn\Delta}\,df \tag{6.26}$$

which reduce to Equations (6.19) and (6.20) when $\Delta = 1$. Note that we have already seen
these equations in Chapter 5, i.e. they are the same as Equations (5.6) and (5.7) which are the
Fourier transform pair for a ‘sampled sequence’.


6.3 THE DISCRETE FOURIER TRANSFORM 

So far we have considered sequences that run over the range —oo < n < oo (n integer). 
For the special case where the sequence is of finite length (i.e. non-zero for a finite 
number of values) an alternative Fourier representation is possible called the discrete 
Fourier transform (DFT). 

It turns out that the DFT is a Fourier representation of a finite length sequence and 
is itself a sequence rather than a continuous function of frequency, and it corresponds to 
samples, equally spaced in frequency, of the Fourier transform of the signal. The DFT 
is fundamental to many digital signal processing algorithms (following the discovery of 
the fast Fourier transform (FFT), which is the name given to an efficient algorithm for 
the computation of the DFT). 

We start by considering the Fourier transform of a (sampled) sequence given by
Equation (6.25). Suppose $x(n\Delta)$ takes some values for $n = 0, 1, \ldots, N-1$, i.e. $N$ points
only, and is zero elsewhere. Then this can be written as

$$X(e^{j2\pi f\Delta}) = \sum_{n=0}^{N-1} x(n\Delta)e^{-j2\pi fn\Delta} \tag{6.27}$$

Note that this is still continuous in frequency. Now, let us evaluate this at frequencies
$f = k/N\Delta$ where $k$ is integer. Then, the right hand side of Equation (6.27) becomes
$\sum_{n=0}^{N-1} x(n\Delta)e^{-j(2\pi/N)nk}$, and we write this as

$$X(k) = \sum_{n=0}^{N-1} x(n\Delta)e^{-j(2\pi/N)nk} \tag{6.28}$$

This is the DFT of a finite (sampled) sequence $x(n\Delta)$. For more usual notation, omitting
$\Delta$, the DFT of $x(n)$ is defined as

$$X(k) = \sum_{n=0}^{N-1} x(n)e^{-j(2\pi/N)nk} \tag{6.29}$$

As a result, the relationship between the Fourier transform of a sequence and the DFT of
a finite length sequence can be expressed as

$$X(k) = X(e^{j2\pi f\Delta}) \text{ evaluated at } f = \frac{k}{N\Delta} \quad (k \text{ integer}) \tag{6.30}$$

i.e. $X(k)$ may be regarded as the sampled version of $X(e^{j2\pi f\Delta})$ in the frequency domain. Note
that, since $X(e^{j2\pi f\Delta})$ is periodic with period $1/\Delta$, we need to evaluate it only for
$k = 0, 1, \ldots, N-1$, i.e. $N$ points only.

The inverse DFT can be found by multiplying both sides of Equation (6.29) by $e^{j(2\pi/N)rk}$
and summing over $k$. Then

$$\sum_{k=0}^{N-1} X(k)e^{j(2\pi/N)rk} = \sum_{k=0}^{N-1}\sum_{n=0}^{N-1} x(n)e^{-j(2\pi/N)nk}e^{j(2\pi/N)rk} = \sum_{k=0}^{N-1}\sum_{n=0}^{N-1} x(n)e^{-j(2\pi/N)k(n-r)} \tag{6.31}$$

Interchanging the summation order on the right hand side of Equation (6.31) and noting that

$$\sum_{k=0}^{N-1} e^{-j(2\pi/N)k(n-r)} = \begin{cases} N & \text{if } n = r \\ 0 & \text{otherwise} \end{cases} \tag{6.32}$$

gives $\sum_{k=0}^{N-1} X(k)e^{j(2\pi/N)rk} = N\cdot x(r)$. Thus, the inverse DFT is given by

$$x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)e^{j(2\pi/N)nk} \tag{6.33}$$

Note that in Equation (6.33), since $e^{j(2\pi/N)(n+N)k} = e^{j(2\pi/N)nk}$, we see that both $X(k)$ and
$x(n)$ are periodic with period $N$. It is important to realize that whilst the original sequence
$x(n)$ is zero for $n < 0$ and $n \ge N$, the act of ‘sampling in frequency’ has imposed a periodic
structure on the sequence. In other words, the DFT of a finite length $x(n)$ implies that $x(n)$ is
one period of a periodic sequence $x_p(n)$, where $x(n) = x_p(n)$ for $0 \le n \le N-1$ and $x_p(n) =
x_p(n + rN)$ ($r$ integer).
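The DFT pair can be written out as matrix products and compared with the built-in `fft`. This is a minimal sketch with a hypothetical 8-point real sequence; the matrix `W` is introduced here only to hold the exponential terms and is not notation from the text.

```matlab
N = 8;
x = [1 2 0 -1 3 0.5 -2 1];     % hypothetical real sequence, n = 0..N-1
n = 0:N-1;
k = (0:N-1).';

W = exp(-1j*2*pi/N * (k*n));   % W(k+1,n+1) = exp(-j(2*pi/N)nk)
X = W * x.';                   % DFT, Equation (6.29)
x_rec = (1/N) * (W' * X);      % inverse DFT, Equation (6.33); W' is the conjugate transpose

err_fft = max(abs(X - fft(x).'));        % agreement with the built-in FFT
err_inv = max(abs(x_rec.' - x));         % recovery of the original sequence

% Orthogonality used in the derivation: W'*W equals N times the identity
err_orth = max(max(abs(W'*W - N*eye(N))));
disp([err_fft, err_inv, err_orth])
```

All three values should be at round-off level. The direct matrix form needs of the order of N^2 operations, which is why the FFT is used in practice.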

As an example, the DFT of a finite length sequence is shown in Figure 6.8 where the
corresponding Fourier transform of a sequence is also shown for comparison. Suppose $x(n)$
has the form shown in Figure 6.8(a); then Figures 6.8(b) and (c) indicate the (continuous)
amplitude and phase of $X(e^{j2\pi f\Delta})$. Figures 6.8(e) and (f) are the corresponding $|X(k)|$ and
$\arg X(k)$, i.e. the DFT of $x(n)$ (equivalently the DFT of $x_p(n)$ in Figure 6.8(d)). These correspond
to evaluating Figures 6.8(b) and (c) at frequencies $f = k/N\Delta$. Note that the periodicity is
present in all figures except Figure 6.8(a).


Figure 6.8 Fourier transform of a sequence and the DFT of a finite length (or periodic) sequence
(panel (a): $x(n)$, with $x(n) = 0$ for $n < 0$ and $n > N-1$; panel (e): $|X(k)| = |X_p(k)|$; panel (f): $\arg X(k) = \arg X_p(k)$)


Data Truncation$^{\text{M6.2}}$

We assumed above that the sequence $x(n)$ was zero for $n$ outside values 0 to $N-1$. In general,
however, signals may not be finite in duration. So, we now consider the truncated sampled
data $x_T(n\Delta)$. For example, consider a finite ($N$ points) sequence (sampled and truncated) as
shown in Figure 6.9.

Figure 6.9 Sampled and truncated sequence $x_T(n\Delta) = x(n\Delta)\cdot w(n\Delta)$, where $w(n\Delta) = 1$ for $0 \le n \le N-1$ and $w(n\Delta) = 0$ otherwise

As we would expect from the windowing effect discussed in Chapter 4, there will be some
distortion in the frequency domain. Let $x_p(n)$ and $w_p(n)$ be the equivalent periodic sequences
of $x(n\Delta)$ and $w(n\Delta)$ for $0 \le n \le N-1$ (omitting $\Delta$ for convenience). Then the DFT of the
truncated signal, $X_T(k)$, becomes

$$X_T(k) = \text{DFT}\left[x_p(n)w_p(n)\right] = \frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{k_1=0}^{N-1}\sum_{k_2=0}^{N-1} X_p(k_1)e^{j(2\pi/N)nk_1}\,W_p(k_2)e^{j(2\pi/N)nk_2}\,e^{-j(2\pi/N)nk}$$

$$= \frac{1}{N^2}\sum_{k_1=0}^{N-1}\sum_{k_2=0}^{N-1} X_p(k_1)W_p(k_2)\sum_{n=0}^{N-1} e^{-j(2\pi/N)n(k-k_1-k_2)}$$

$$= \frac{1}{N}\sum_{k_1=0}^{N-1} X_p(k_1)W_p(k-k_1) = \frac{1}{N}X_p(k) \circledast W_p(k) \tag{6.34}$$

This is the convolution of the two periodic sequences, hence the distortion in the frequency
domain. The symbol $\circledast$ denotes circular convolution (this will be explained in Section
6.5). The windowing effect will be demonstrated in MATLAB Example 6.2.


Alternative Representation of the DFT 

Starting with the z-transform of $x(n)$, i.e. $X(z)$, then when $z = e^{j2\pi f\Delta}$ a circle of unit
radius is picked out, and $X(e^{j2\pi f\Delta})$ is the value of $X(z)$ evaluated at points on the unit circle.
When $f = k/N\Delta$, this amounts to evaluating $X(z)$ at specific points on the unit circle, i.e. $N$
evenly spaced points around the unit circle. This gives the DFT expression $X(k)$ as illustrated in
Figure 6.10.

Figure 6.10 Representation of the DFT in the z-plane

Frequency Resolution and Zero Padding

As we have seen earlier in Chapter 4, the frequency resolution of Fourier transform $X_T(f)$
depends on the data length (or window length) $T$. Note that the data length of the truncated
sampled sequence $x_T(n\Delta)$ is $T = N\Delta$, and the frequency spacing in $X_T(k)$ is $1/N\Delta =
1/T$ Hz. Thus, we may have an arbitrarily fine frequency spacing when $T \to \infty$.

If the sequence $x(n\Delta)$ is finite in nature, then the Fourier transform of a sequence
$X(e^{j2\pi f\Delta})$ is fully representative of the original sequence without introducing truncation,
because

$$X(e^{j2\pi f\Delta}) = \sum_{n=-\infty}^{\infty} x(n\Delta)e^{-j2\pi fn\Delta} = \sum_{n=0}^{N-1} x(n\Delta)e^{-j2\pi fn\Delta}$$

Then, the DFT $X(k) = X(e^{j2\pi f\Delta})\big|_{f=k/N\Delta}$ gives the frequency spacing $1/N\Delta$ Hz. This
spacing may be considered sufficient because we do not lose any information, i.e. we can
completely recover $x(n\Delta)$ from $X(k)$.

However, we often want to see more detail in the frequency domain, such as finer
frequency spacing. A convenient procedure is simply to ‘add zeros’ to $x(n)$, i.e. define

$$\hat{x}(n) = \begin{cases} x(n) & 0 \le n \le N-1 \\ 0 & N \le n \le L-1 \end{cases} \tag{6.35}$$

Then the $L$-point DFT of $\hat{x}(n)$ is

$$\hat{X}(k) = \sum_{n=0}^{L-1} \hat{x}(n)e^{-j(2\pi/L)nk} = \sum_{n=0}^{N-1} x(n)e^{-j(2\pi/L)nk} \tag{6.36}$$

Thus, we see that $\hat{X}(k) = X(e^{j(2\pi/L)k})$, $k = 0, 1, \ldots, L-1$, i.e. ‘finer’ spacing round the unit
circle in the z-plane (see Figure 6.10). In other words, zero padding in the time domain results
in interpolation in the frequency domain (Smith, 2003). In vibration problems, this can be
used to obtain the fine detail near resonances. However, care must be taken with this artificially
made finer structure: the zero padding does not increase the ‘true’ resolution (see MATLAB
Example 4.6 in Chapter 4), i.e. the fundamental resolution is fixed and it is only the frequency
spacing that is reduced.
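The interpolating effect of zero padding can be seen directly with the `fft` function. The sketch below assumes a hypothetical 16-point sequence padded to L = 64 points (L is chosen as an integer multiple of N so that the original DFT lines reappear among the new ones).

```matlab
N = 16;  L = 64;               % L is an integer multiple of N here (assumed)
n = 0:N-1;
x = sin(2*pi*0.2*n);           % hypothetical finite-length sequence

X  = fft(x);                   % N-point DFT
Xh = fft([x, zeros(1, L-N)]);  % L-point DFT of the zero-padded sequence, Equation (6.36)

% Every (L/N)th line of the padded DFT coincides with the N-point DFT
err_same = max(abs(Xh(1:L/N:end) - X));

% The additional lines are samples of the same Fourier transform of the sequence
k = 5;                         % an arbitrary line of the L-point DFT
X_direct = sum(x .* exp(-1j*2*pi*(k/L)*n));
err_line = abs(Xh(k+1) - X_direct);
disp([err_same, err_line])
```

The padded DFT therefore adds intermediate samples of the same underlying transform; it does not sharpen the main lobe, whose width is still set by the N original data points.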

An interesting feature is that, with zero padding in the frequency domain, performing the 
inverse DFT results in interpolation in the time domain, i.e. an increased sampling rate in the 
time domain (note that zeros are padded symmetrically with respect to N / 2, and it is assumed 
that X(N / 2) = 0 for an even number of N). So zero padding in one domain results in a finer 
structure in the other domain. 

Zero padding is sometimes useful for analysing a transient signal that dies away quickly. 
For example, if we estimate the FRF of a system using the impact testing method, the measured 
signal (from the force sensor of an impact hammer) quickly falls into the noise level. In this 
case, we can artificially improve the quality of the measured signal by replacing the data in 
the noise region with zeros (see MATLAB Example 6.7); note that the measurement time may 
also be increased (in effect) by adding more zeros. This approach can also be applied to the 
free vibration signal of a highly damped system (see MATLAB Example 6.5). 
Scaling Effects$^{\text{M6.2}}$

If the sequence x(nA) results from sampling a continuous signal x(t) we must consider the 
scaling effect on X(k) as compared with X(f). We need to consider the scaling effect differently 
for transient signals and periodic signals. For a transient signal, the energy of the signal is 
finite. Assuming that the data window is large enough so that the truncation of data does not 
introduce a loss of energy, the only scaling factor is the sampling interval A. However, if the 
original signal is periodic the energy is infinite, so in addition to the scaling effect introduced 
by sampling, the DFT coefficients will have different amplitudes depending on the length of 
the data window. This effect can be easily justified by comparing Parseval’s theorems for a 
periodic signal (Equation (3.39)) and for a transient signal (Equation (4.17)). The following 
example shows the relationship between the Fourier integral and the DFT, together with the 
scaling effect for a periodic signal. 

Consider a periodic continuous signal $x(t) = A\cos 2\pi pt$, $p = 1/T_P$, and its Fourier
integral, as shown in Figure 6.11(a). Suppose we use the data length $T$ seconds; then its
effect is applying the rectangular window as shown in Figure 6.11(b). Note that the magnitude
spectrum of $W(f)$ depends on the window length $T$. The Fourier integral of the truncated signal
is shown in Figure 6.11(c), and the Fourier transform of a truncated and sampled signal is
shown in Figure 6.11(d). Note the periodicity in this figure. Note also the windowing effects and
the amplitude differences for each transform (especially the scaling factor in Figure 6.11(d)).

Figure 6.11 Various Fourier transforms of a sinusoidal signal: (a) a periodic signal and its
Fourier integral; (b) data window and its Fourier integral $|W(f)|$; (c) truncated signal
$x_T(t) = w(t)x(t)$ and its Fourier integral $|X_T(f)|$; (d) truncated and sampled signal
$x_T(n\Delta) = w(n\Delta)\cdot x(n\Delta)$ and its Fourier transform of a sequence $|X_T(e^{j2\pi f\Delta})|$

Now consider the DFT of the truncated and sampled sequence. The DFT results in
frequencies at $f_k = k/N\Delta$, $k = 0, 1, \ldots, N-1$, i.e. the frequency range covers from 0 Hz to
$(f_s - f_s/N)$ Hz. Thus, if we want frequency $p$ to be picked out exactly, we need $k/N\Delta = p$
for some $k$. Suppose we sample at every $\Delta = T_P/10$ and take one period (10-point DFT)
exactly, i.e. $T\,(= N\Delta) = T_P\,(= 1/p)$. As shown in Figure 6.12, the frequency separation is
$1/N\Delta = 1/T_P = p$ (Hz), thus $p = f_1 = 1/N\Delta$ which is the second line on the discrete
frequency axis ($f_k = k/N\Delta$). Note that the first line is $f_0 = 0$ (Hz), i.e. the d.c. component.
All other frequencies ($f_k$ except $f_1$ and $f_9$) are ‘zeros’, since these frequencies correspond to
the zero points of the side lobes that are separated by $1/T = 1/T_P$. Thus, the resulting DFT
is one single spike (up to $k = N/2$).

Figure 6.12 The 10-point DFT of the truncated and sampled sinusoidal signal, $T = T_P$
($N = 10$, $\Delta = T_P/10$; line spacing $1/N\Delta = 1/T_P$; peak value $AT/2\Delta = A(1 \times T_P)/2(T_P/10) = 5A$ at $k = 1$ and $k = N-1 = 9$; $k = 5$ corresponds to $f_s/2$)

Since the DFT has a periodic structure, $X(10)$ (if it is evaluated) will be equal to $X(0)$.
Also, due to the symmetry property of the magnitude spectrum of $X(k)$, the right half of
the figure is the mirror image of the left half such that $|X(1)| = |X(9)|$, $|X(2)| = |X(8)|$, ...,
$|X(4)| = |X(6)|$. Note that the magnitude of $X(1)$ is $5A$. Also note that $X(5)$ is the value at the
folding frequency $f_s/2$. From the fact that we have taken an ‘even’-numbered DFT, we have
the DFT coefficient at the folding frequency. However, if we take an ‘odd’-numbered DFT, then
it cannot be evaluated at the folding frequency. For example, if we take the nine-point DFT,
the symmetric structure will become $|X(1)| = |X(8)|$, $|X(2)| = |X(7)|$, ..., $|X(4)| = |X(5)|$
(see Section 6.4 and MATLAB Example 6.4).

For the same sampling interval, if we take five periods exactly, i.e. $T\,(= N\Delta) = 5T_P$
(50-point DFT), then the frequency separation is $1/N\Delta = 1/(50 \cdot T_P/10) = 1/5T_P =
p/5$ (Hz) as shown in Figure 6.13. Thus, $p = f_5 = 5/N\Delta$ which is the sixth line on the
discrete frequency axis. Again, all other frequencies $f_k$ (except $f_5$ and $f_{45}$) are ‘zeros’, since
these frequencies also correspond to the zero points of the side lobes that are now separated by
$1/T = 1/5T_P$. Note the magnitude change at the peak frequency, which is now $25A$ (compare
this with the previous case, the 10-point DFT).

Figure 6.13 The 50-point DFT of the truncated and sampled sinusoidal signal, $T = 5T_P$
($N = 50$, $\Delta = T_P/10$; line spacing $1/N\Delta = 1/5T_P$; peak value $AT/2\Delta = A(5 \times T_P)/2(T_P/10) = 25A$ at $k = 5$ and $k = N-5 = 45$)
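The two peak values quoted above can be reproduced with a short script. This is a sketch under assumed values for the amplitude A and the period Tp (neither is specified in the text); only the ratio of the sampling interval to the period, D = Tp/10, is taken from the example.

```matlab
A  = 2;                        % amplitude (assumed)
Tp = 0.1;                      % period in seconds, i.e. p = 10 Hz (assumed)
D  = Tp/10;                    % sampling interval, as in the text

peak = zeros(1, 2);
cycles = [1 5];                % one period (N = 10) and five periods (N = 50)
for m = 1:2
    N = 10*cycles(m);
    n = 0:N-1;
    x = A*cos(2*pi*(1/Tp)*n*D);
    X = abs(fft(x));
    peak(m) = X(cycles(m) + 1);    % line k = 1 (N = 10) or k = 5 (N = 50)
end
disp(peak / A)                 % approximately 5 and 25, i.e. N/2 in each case
disp(peak * D)                 % approximately A*T/2 after multiplying by D
```

Multiplying the DFT by the sampling interval removes the dependence on D, but the peak still grows with the window length T, which is the scaling effect for a periodic signal described above.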

If a non-integer number of periods are taken, this will produce all non-zero frequency
components (as we have seen in MATLAB Example 4.6 in Chapter 4; see also MATLAB
Example 6.2b). The ‘scaling’ effect is due to both ‘sampling’ and ‘windowing’, and so different
window types may produce different scaling effects (see MATLAB Examples 4.6 and 4.7
in Chapter 4). Since the DFT evaluates values at frequencies $f_k = k/N\Delta$, the frequency
resolution can only be improved by increasing $N\Delta$ (= window length, $T$). Thus, if the sampling
rate is increased (i.e. smaller $\Delta$ is used), then we need more data (larger $N$) in order to maintain
the same resolution (see Comments in MATLAB Example 6.3).


6.4 PROPERTIES OF THE DFT 

The properties of the DFT are fundamental to signal processing. We summarize a few here: 

(a) The DFT of the Kronecker delta function 8(n) is 

JV— l 

DFT [<$(«)] = J2 8 (n)e^ K27r,N)nk = e ~ mlN ^' k = 1 (6.37) 

n = 0 

(Note that the Kronecker delta function 8{n) is analogous to its continuous counterpart, 
the Dirac delta function 8(t), but it cannot be related as the sampling of 5(f).) 

(b) Linearity: If DFT [.i(jz)l = X{k) and DFT [y(n)] = Y ( k ), then 

DFT [ax(n) + by(n)] = aX(k) + bY(k) (6.38) 

(c) Shifting property: If DFT [x(n)] = X(k), then

$$\mathrm{DFT}[x(n - n_0)] = e^{-j(2\pi/N)n_0 k} X(k) \tag{6.39}$$

Special attention must be given to the meaning of a time shift of a finite duration sequence.
Shown in Figure 6.14 is the finite sequence x(n) of duration N samples (marked •). The
N-point DFT of x(n) is X(k). Also shown are the samples of the ‘equivalent’ periodic
sequence $x_p(n)$ with the same DFT as x(n).

If we want the DFT of $x(n - n_0)$, $n_0 < N$, we must consider a shift of the periodic
sequence $x_p(n - n_0)$, and the equivalent finite duration sequence with DFT $e^{-j(2\pi/N)n_0 k} X(k)$
is that part of $x_p(n - n_0)$ in the interval $0 \le n \le N - 1$, as shown in Figure 6.15 for
$n_0 = 2$ (for example), i.e. shift to right.

Figure 6.14 Finite sequence x(n) and equivalent periodic sequence $x_p(n)$

Figure 6.15 Shifted finite sequence $x(n - n_0)$ and equivalent shifted periodic sequence $x_p(n - n_0)$


Examining Figures 6.14 and 6.15, we might imagine the sequence x(n) as displayed
around the circumference of a cylinder in such a way that the cylinder has N points on it.
As the cylinder revolves we see $x_p(n)$, i.e. we can talk of a ‘circular’ shift.

(d) Symmetry properties (see MATLAB Example 6.4): For real data x(n), we have the following symmetry properties.
An example is shown in Figure 6.16 (compare the symmetric structures for even and odd
numbers of N). Note that, at N/2, the imaginary part must be ‘zero’, and the phase can
be either ‘zero or $\pi$’ depending on the sign of real part:

$$\mathrm{Re}[X(k)] = \mathrm{Re}[X(N - k)] \tag{6.40a}$$

$$\mathrm{Im}[X(k)] = -\mathrm{Im}[X(N - k)] \tag{6.40b}$$

$$|X(k)| = |X(N - k)| \tag{6.41a}$$

$$\arg X(k) = -\arg X(N - k) \tag{6.41b}$$

Or, we may express the above results as (* denotes complex conjugate)

$$X(N - k) = X^*(k) \tag{6.42}$$

[Figure: $|X(k)|$ and $\arg X(k)$ versus $k$ for even and odd numbers of N]

Figure 6.16 Symmetry properties of the DFT
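
The shifting property (6.39) and the symmetry property (6.42) can be checked numerically. The following is a minimal sketch rather than part of the original examples: the test sequence and the variable names are assumptions, and the MATLAB function ‘circshift’ is used to form one period of the shifted periodic sequence.

```matlab
% Numerical check of the shifting property (6.39) and the symmetry property (6.42).
x = [1 2 3 4 5 6 7 8];          % arbitrary real sequence (assumed values), N = 8
N = length(x);
k = 0:N-1;
X = fft(x);

% Shifting: a shift of the finite sequence is a circular shift of one period
n0 = 2;                         % shift to the right by two samples
xs = circshift(x, [0 n0]);      % one period of x_p(n - n0): [7 8 1 2 3 4 5 6]
errShift = max(abs(fft(xs) - exp(-1i*(2*pi/N)*n0*k).*X));

% Symmetry for real data: X(N-k) = conj(X(k)) for k = 1,...,N-1
% (MATLAB indices start at 1, so X(k) is stored in X(k+1))
kk = 1:N-1;
errSym = max(abs(X(N-kk+1) - conj(X(kk+1))));
imagAtHalf = imag(X(N/2+1));    % imaginary part at k = N/2 (N is even here)

disp([errShift errSym imagAtHalf])
```

All three displayed values should be at the level of rounding error.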
6.5 CONVOLUTION OF PERIODIC SEQUENCES (see MATLAB Example 6.6)


Consider two periodic sequences with the same length of period, $x_p(n)$ and $h_p(n)$, and their
DFTs as follows:

$$X_p(k) = \sum_{n=0}^{N-1} x_p(n) e^{-j(2\pi/N)nk} \tag{6.43a}$$

$$H_p(k) = \sum_{n=0}^{N-1} h_p(n) e^{-j(2\pi/N)nk} \tag{6.43b}$$

Then, similar to the property of Fourier transforms, the DFT of the convolution of two periodic
sequences is the product of their DFTs, i.e. DFT $[y_p(n) = x_p(n) * h_p(n)]$ is

$$Y_p(k) = X_p(k) H_p(k) \tag{6.44}$$

The proof of this is given below:

$$\begin{aligned}
Y_p(k) &= \mathrm{DFT}[x_p(n) * h_p(n)] = \mathrm{DFT}\left[\sum_{r=0}^{N-1} x_p(r) h_p(n - r)\right] \\
&= \sum_{n=0}^{N-1} \sum_{r=0}^{N-1} x_p(r) h_p(n - r) e^{-j(2\pi/N)nk} \\
&= \sum_{r=0}^{N-1} x_p(r) \sum_{n=0}^{N-1} h_p(n - r) e^{-j(2\pi/N)(n - r)k} e^{-j(2\pi/N)rk} \\
&= \sum_{r=0}^{N-1} x_p(r) e^{-j(2\pi/N)rk} \cdot \sum_{n=0}^{N-1} h_p(n - r) e^{-j(2\pi/N)(n - r)k} = X_p(k) \cdot H_p(k)
\end{aligned} \tag{6.45}$$


This is important, so we consider its interpretation carefully. $y_p(n)$ is called a circular
convolution, or sometimes a periodic convolution. Let us look at the result of convolving two
periodic sequences in Figure 6.17.

Now, from $y_p(n) = x_p(n) * h_p(n) = \sum_{r=0}^{N-1} x_p(r) h_p(n - r)$, we draw the sequences in
question as functions of r. To draw $h_p(n - r)$, we first draw $h_p(-r)$, i.e. we ‘reverse’ the
sequence $h_p(r)$ and then move it n places to the right. For example, $h_p(0 - r)$, $h_p(2 - r)$ and
$x_p(r)$ are as shown in Figure 6.18.




[Figure: the periodic sequences $x_p(n)$ and $h_p(n)$, each with period N = 5, plotted against n; one period is marked]

Figure 6.17 Two periodic sequences $x_p(n)$ and $h_p(n)$

[Figure: $x_p(r)$ and $h_p(0 - r)$; $x_p(r)$ and $h_p(2 - r)$]

Figure 6.18 Illustration of the circular convolution process

As n varies, $h_p(n - r)$ slides over $x_p(r)$ and it may be seen that the result of the convolution
is the same for n = 0 as it is for n = N and so on, i.e. $y_p(n)$ is periodic, hence the term circular
or periodic convolution. The resulting convolution is shown in Figure 6.19.

[Figure: $y_p(n) = x_p(n) * h_p(n)$ plotted against n]

Figure 6.19 Resulting sequence of the convolution of $x_p(n)$ and $h_p(n)$


Often the symbol ⊛ is used to denote circular convolution to distinguish it from linear
convolution. Let us consider another simple example of circular convolution. Suppose we have
two finite sequences x(n) = [1, 3, 4] and h(n) = [1, 2, 3]. Then the values of the circular
convolution y(n) = x(n) ⊛ h(n) are

$$\begin{aligned}
y(0) &= \sum_{r=0}^{2} x(r) h(0 - r) = 18, \quad \text{where } h(0 - r) = [1, 3, 2] \\
y(1) &= \sum_{r=0}^{2} x(r) h(1 - r) = 17, \quad \text{where } h(1 - r) = [2, 1, 3] \\
y(2) &= \sum_{r=0}^{2} x(r) h(2 - r) = 13, \quad \text{where } h(2 - r) = [3, 2, 1]
\end{aligned} \tag{6.46}$$

Note that y(3) = y(0) and h(3 - r) = h(0 - r) if they are to be evaluated.
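
The values in (6.46) can be reproduced by evaluating the sum directly with the index n - r taken modulo N. The sketch below is illustrative only (the variable names are assumptions); the last line forms the same result from the product of the DFTs, as stated by (6.44).

```matlab
% Circular convolution of x = [1 3 4] and h = [1 2 3] by direct summation.
x = [1 3 4];  h = [1 2 3];
N = length(x);
y = zeros(1, N);
for n = 0:N-1
    for r = 0:N-1
        idx = mod(n - r, N);                % index n - r taken modulo N (wrap-around)
        y(n+1) = y(n+1) + x(r+1)*h(idx+1);  % MATLAB indices start at 1
    end
end
disp(y)                                     % displays 18 17 13

% The same values from the inverse DFT of the product of the DFTs
yd = real(ifft(fft(x).*fft(h)));
```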

If we are working with finite duration sequences, say x(n) and h(n), and then take DFTs
of these, there are then ‘equivalent’ periodic sequences with the same DFTs, i.e. $X_p(k) = X(k)$
and $H_p(k) = H(k)$. If we form the inverse DFT (IDFT) of the product of these, i.e.
IDFT $[H_p(k)X_p(k)]$ or IDFT [H(k)X(k)], then the result will be circular convolution of the
two finite sequences:

$$x(n) \circledast h(n) = \mathrm{IDFT}[X(k)H(k)] \tag{6.47}$$


Sometimes, we may wish to form the linear convolution of the two sequences as discussed
in Section 6.1. Consider two finite sequences x(n) and h(n), where n = 0, 1, . . . , N - 1 as
shown in Figures 6.20(a) and (b). Note that these are the same sequences as in Figure 6.17,
but their lengths are now only one period. The linear convolution of these two sequences,
y(n) = x(n) * h(n), results in a sequence with nine points as shown in Figure 6.20(c).


[Figure: (a) x(n), n = 0, 1, ..., N - 1; (b) h(n), n = 0, 1, ..., N - 1; (c) y(n), n = 0, 1, ..., L - 1, where L = 2N - 1 = 9]

Figure 6.20 Linear convolution of two finite sequences


The question is: can we do it using DFTs? (We might wish to do this because the FFT 
offers a procedure that could be quicker than direct convolution.) 

We can do this using DFTs once we recognize that the y(n) may be regarded as one period
of a periodic sequence of period 9. To get this periodic sequence we add zeros to x(n) and
h(n) to make x(n) and h(n) of length 9 (as shown in Figures 6.21(a) and (b)), and form the
nine-point DFT of each. Then we take the IDFT of the product to get the required convolution,
i.e. x(n) ⊛ h(n) = IDFT [X(k)H(k)]. The result of this approach is shown in Figure 6.21(c)
which is the same as Figure 6.20(c).


[Figure: (a) x(n) with added zeros, N = 9; (b) h(n) with added zeros, N = 9; (c) y(n) = IDFT [X(k)H(k)], n = 0, 1, ..., 8]

Figure 6.21 Linear convolution of two finite sequences using the DFT

More generally, suppose we wish to convolve two sequences x(n) and h(n) of length
$N_1$ and $N_2$, respectively. The linear convolution of these two sequences is a sequence y(n) of
length $N_1 + N_2 - 1$. To obtain this sequence from a circular convolution we require x(n) and
h(n) to be sequences of $N_1 + N_2 - 1$ points, which is achieved by simply adding zeros to x(n)
and h(n) appropriately. Then we take the DFTs of these augmented sequences, multiply them
together and take the IDFT of the product. A single period of the resulting sequence is the
required convolution. (The extra zeros on x(n) and h(n) eliminate the ‘wrap-around’ effect.)
This process is called fast convolution. Note that the number of zeros added must ensure that
x(n) and h(n) are of length greater than or equal to $N_1 + N_2 - 1$ and both the same length.
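
The fast convolution procedure described above can be sketched as follows. The two short sequences and the variable names are assumptions made for illustration; ‘fft(x, L)’ appends zeros to x up to length L before transforming, and the result is compared with the direct linear convolution given by ‘conv’.

```matlab
% Fast (linear) convolution via zero padding and the DFT.
x = [1 3 4];                    % N1 = 3 (assumed example values)
h = [1 2 3 1];                  % N2 = 4 (assumed example values)
L = length(x) + length(h) - 1;  % length of the linear convolution, N1 + N2 - 1 = 6
X = fft(x, L);                  % L-point DFT of x padded with zeros
H = fft(h, L);                  % L-point DFT of h padded with zeros
y_fast = real(ifft(X.*H));      % one period of the circular convolution of the padded sequences
y_lin  = conv(x, h);            % direct linear convolution for comparison
err = max(abs(y_fast - y_lin)); % should be at rounding-error level
disp(y_lin); disp(err);
```

If L were chosen smaller than $N_1 + N_2 - 1$, the tail of the linear convolution would wrap around and add to the first points of the result.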


6.6 THE FAST FOURIER TRANSFORM 

A set of algorithms known as the fast Fourier transform (FFT) has been developed to reduce 
the computation time required to evaluate the DFT coefficients. The FFT algorithm was 
rediscovered by Cooley and Tukey (1965); the same algorithm had been used by the German
mathematician Karl Friedrich Gauss around 1805 to interpolate the trajectories of asteroids. 
Owing to the high computational efficiency of the FFT, so-called real-time signal processing 
became possible. This section briefly introduces the basic ‘decimation in time’ method for 
a radix 2 FFT. For more details of FFT algorithms, see various references (Oppenheim and 
Schafer, 1975; Rabiner and Gold, 1975; Duhamel and Vetterli, 1990). 


The Radix 2 FFT 

Since the DFT of a sequence is defined by $X(k) = \sum_{n=0}^{N-1} x(n) e^{-j(2\pi/N)nk}$, k = 0, 1, . . . ,
N - 1, by defining $W_N = e^{-j(2\pi/N)}$ the DFT can be rewritten as

$$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{nk} \tag{6.48}$$

It is this expression that we shall consider. Note that $W_N^{nk}$ is periodic with period N (in both
k and n), and the subscript N denotes the periodicity. The number of multiply and add operations
to calculate the DFT directly is approximately $N^2$, so we need more efficient algorithms to
accomplish this. The FFT algorithms use the periodicity and symmetry property of $W_N^{nk}$, and
reduce the number of operations $N^2$ to approximately $N \log_2 N$ (e.g. if N = 1024 the number
of operations is reduced by a factor of about 100).
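
The size of this reduction is easily tabulated. The short sketch below uses the approximate operation counts quoted above; the data lengths chosen are assumptions for illustration.

```matlab
% Approximate operation counts: direct DFT (N^2) versus FFT (N*log2(N)).
N = [64 256 1024 4096];         % data lengths, all powers of two (assumed values)
nDFT = N.^2;                    % direct evaluation of the DFT
nFFT = N.*log2(N);              % radix 2 FFT
ratio = nDFT./nFFT;             % reduction factor, equal to N/log2(N)
for m = 1:length(N)
    fprintf('N = %4d: N^2 = %8d, N log2 N = %6d, ratio = %.1f\n', ...
        N(m), nDFT(m), nFFT(m), ratio(m));
end
```

For N = 1024 the ratio is 1024/10 = 102.4, i.e. the factor of about 100 mentioned above.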

In particular, we shall consider the case of N to be the power of two, i.e. $N = 2^{\nu}$. This leads
to the base 2 or radix 2 algorithm. The basic principle of the algorithm is that of decomposing 
the computation of a DFT of length N into successively smaller DFTs. This may be done in 
many ways, but we shall look at the decimation in time (DIT) method. 

The name indicates that the sequence x(n) is successively decomposed into smaller
subsequences. We take a general sequence x(n) and define $x_1(n)$, $x_2(n)$ as sequences with half the
number of points and with

$$x_1(n) = x(2n), \quad n = 0, 1, \ldots, \frac{N}{2} - 1, \quad \text{i.e. even number of } x(n) \tag{6.49a}$$

$$x_2(n) = x(2n + 1), \quad n = 0, 1, \ldots, \frac{N}{2} - 1, \quad \text{i.e. odd number of } x(n) \tag{6.49b}$$

Then

$$\begin{aligned}
X(k) &= \sum_{n=0}^{N-1} x(n) W_N^{nk} = \sum_{\substack{n=0 \\ (\text{even})}}^{N-1} x(n) W_N^{nk} + \sum_{\substack{n=1 \\ (\text{odd})}}^{N-1} x(n) W_N^{nk} \\
&= \sum_{n=0}^{N/2-1} x(2n) W_N^{2nk} + \sum_{n=0}^{N/2-1} x(2n + 1) W_N^{(2n+1)k}
\end{aligned} \tag{6.50}$$


Noting that $W_N^2 = [e^{-j(2\pi/N)}]^2 = e^{-j[2\pi/(N/2)]} = W_{N/2}$, Equation (6.50) can be written as

$$X(k) = \sum_{n=0}^{N/2-1} x_1(n) W_{N/2}^{nk} + W_N^k \sum_{n=0}^{N/2-1} x_2(n) W_{N/2}^{nk} \tag{6.51}$$

i.e.

$$X(k) = X_1(k) + W_N^k X_2(k) \tag{6.52}$$

where $X_1(k)$ and $X_2(k)$ are N/2-point DFTs of $x_1(n)$ and $x_2(n)$. Note that, since X(k) is defined
for $0 \le k \le N - 1$ and $X_1(k)$, $X_2(k)$ are periodic with period N/2, then

$$X(k) = X_1\left(k - \frac{N}{2}\right) + W_N^k X_2\left(k - \frac{N}{2}\right), \quad \frac{N}{2} \le k \le N - 1 \tag{6.53}$$

The above Equations (6.52) and (6.53) can be used to develop the computational procedure.
For example, if N = 8 it can be shown that two four-point DFTs are needed to make up the
full eight-point DFT. Now we do the same to the four-point DFT, i.e. divide $x_1(n)$ and $x_2(n)$
each into two sequences of even and odd numbers, e.g.

$$X_1(k) = A(k) + W_{N/2}^k B(k) = A(k) + W_N^{2k} B(k) \quad \text{for } 0 \le k \le \frac{N}{2} - 1 \tag{6.54}$$

where A(k) is a two-point DFT of even numbers of $x_1(n)$, and B(k) is a two-point DFT of odd
numbers of $x_1(n)$. This results in four two-point DFTs in total. Thus, finally, we only need to
compute two-point DFTs.

In general, the total number of multiply and add operations is $N \log_2 N$. Finally, we
compare the number of operations $N^2$ (DFT) versus $N \log_2 N$ (FFT) in Table 6.1.


Table 6.1 Number of multiply and add operations, FFT versus DFT

N       $N^2$ (DFT)    $N \log_2 N$ (FFT)    $N^2/(N \log_2 N)$
16      256            64                    4.0
512     262 144        4608                  56.9
2048    4 194 304      22 528                186.2
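
One stage of the decimation in time, i.e. Equations (6.52) and (6.53), can be verified numerically. In the sketch below the eight-point test sequence and the variable names are assumptions, and the two N/2-point DFTs are simply computed with the MATLAB function ‘fft’ for brevity rather than being decomposed further.

```matlab
% One stage of the radix 2 decimation in time, Equations (6.52) and (6.53).
x  = [1 2 0 -1 3 1 -2 4];       % arbitrary sequence (assumed values), N = 8
N  = length(x);
x1 = x(1:2:end);                % x1(n) = x(2n), even-numbered samples
x2 = x(2:2:end);                % x2(n) = x(2n+1), odd-numbered samples
X1 = fft(x1);                   % N/2-point DFT of x1
X2 = fft(x2);                   % N/2-point DFT of x2
k  = 0:N-1;
W  = exp(-1i*2*pi*k/N);         % factors W_N^k for k = 0,...,N-1
% X1 and X2 are periodic with period N/2, so they are repeated to cover all k
X  = [X1 X1] + W.*[X2 X2];
err = max(abs(X - fft(x)));     % compare with the N-point DFT
disp(err)
```

Applying the same split to x1 and x2 in turn, until only two-point DFTs remain, gives the complete radix 2 algorithm.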


6.7 BRIEF SUMMARY 


1. The input-output relationship of a digital LTI system is expressed by the convolution
of two sequences of h(n) and x(n), i.e.

$$y(n) = \sum_{k=-\infty}^{\infty} h(n - k) x(k) = \sum_{r=-\infty}^{\infty} h(r) x(n - r) \quad \text{or} \quad y(n) = x(n) * h(n) = h(n) * x(n)$$

The Fourier transform of the sequence h(n), $H(e^{j2\pi f})$, is called the system frequency
response function (FRF), where

$$H(e^{j2\pi f}) = \sum_{n=-\infty}^{\infty} h(n) e^{-j2\pi fn} \quad \text{and} \quad h(n) = \int_{-1/2}^{1/2} H(e^{j2\pi f}) e^{j2\pi fn} \, df$$

Note that $H(e^{j2\pi f})$ is continuous and periodic in frequency.
2. The Fourier transform of the convolution of two sequences is the product of their
transforms, i.e.

$$F\{y(n) = x(n) * h(n)\} = Y(e^{j2\pi f}) = X(e^{j2\pi f}) H(e^{j2\pi f})$$


3. The DFT pair for a finite (or periodic) sequence is

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k) e^{j(2\pi/N)nk} \quad \text{and} \quad X(k) = \sum_{n=0}^{N-1} x(n) e^{-j(2\pi/N)nk}$$

Note that the N-point DFT of a finite length sequence x(n) imposes a periodic structure
on the sequence.

4. The frequency spacing in X(k) can be made finer by adding zeros to the end of the sequence
x(n). However, care must be taken since this is not a ‘true’ improvement in resolution
(ability to distinguish closely spaced frequency components).

5. The relationship between the DFT X(k) and the Fourier transform of a (sampled)
sequence $X(e^{j2\pi f\Delta})$ is

$$X(k) = X(e^{j2\pi f\Delta}) \text{ evaluated at } f = \frac{k}{N\Delta} \text{ Hz}$$

Note that this sampling in frequency imposes the periodicity in the time domain (as
does the sampling in the time domain which results in periodicity in the frequency
domain).

6. If a signal is sampled and truncated, we must consider the windowing effect (distortion
in the frequency domain) and the scaling factor as compared with the Fourier
transform of the original signal.

7. Symmetry properties of the DFT are given by 


X(N -k) = X*(k) 


8. The circular convolution of two finite sequences can be obtained by the inverse DFT
of the product of their DFTs, i.e.

$$x(n) \circledast h(n) = \mathrm{IDFT}[X(k)H(k)]$$

The linear convolution of these two sequences, y(n) = x(n) * h(n), can also be obtained
via the DFT by adding zeros to x(n) and h(n) appropriately.

9. The fast Fourier transform (FFT) is an efficient algorithm for the computation of the 
DFT (the same algorithm can be used to compute the inverse DFT). There are many 
FFT algorithms. There used to be a restriction of data length N to be a power of two, 
but there are algorithms available that do not have this restriction these days (see 
FFTW, http://www.fftw.org). 

10. Finally, we summarize the various Fourier transforms in Figure 6.22 (we follow 
the display method given by Randall, 1987) and the pictorial interpretation of 
the DFT of a sampled and truncated signal is given in Figure 6.23 (see Brigham, 
1988). 
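[Figure: the transform pairs below, each illustrated by a sketch of the signal against t (or index n) and of its spectrum against f (or index k)]

Fourier series (time domain: continuous, periodic; frequency domain: discrete)

$$x(t) = \sum_{n=-\infty}^{\infty} c_n e^{j2\pi nt/T_p}, \qquad c_n = \frac{1}{T_p} \int_0^{T_p} x(t) e^{-j2\pi nt/T_p} \, dt$$

Fourier integral (time domain: continuous; frequency domain: continuous)

$$x(t) = \int_{-\infty}^{\infty} X(f) e^{j2\pi ft} \, df, \qquad X(f) = \int_{-\infty}^{\infty} x(t) e^{-j2\pi ft} \, dt$$

Fourier transform of a (sampled) sequence (time domain: discrete; frequency domain: continuous, periodic)

$$x(n\Delta) = \Delta \int_{-1/2\Delta}^{1/2\Delta} X(e^{j2\pi f\Delta}) e^{j2\pi fn\Delta} \, df, \qquad X(e^{j2\pi f\Delta}) = \sum_{n=-\infty}^{\infty} x(n\Delta) e^{-j2\pi fn\Delta}$$

Discrete Fourier transform (DFT) (time domain: discrete, periodic; frequency domain: discrete, periodic)

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k) e^{j(2\pi/N)nk}, \qquad X(k) = \sum_{n=0}^{N-1} x(n) e^{-j(2\pi/N)nk}$$

Figure 6.22 Summary of various Fourier transforms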
[Figure: three stages, each shown in the time domain and in the frequency domain.
Fourier transform of truncated signal: $x_T(t) = x(t)w(t)$, where w(t) is a data window; $X_T(f) = F\{x_T(t)\}$.
Fourier transform of sampled, truncated signal: $x_T(t)$ is sampled every $\Delta$ seconds, giving $x_T(n\Delta)$; $X_T(e^{j2\pi f\Delta}) = F\{x_T(n\Delta)\}$.
DFT of sampled, truncated signal: the DFT imposes periodicity in the time domain; X(k) = DFT [x(n)], where $X_T(e^{j2\pi f\Delta})$ is sampled every $1/N\Delta$ Hz.]

Figure 6.23 Pictorial interpretations (from the Fourier integral to the DFT)
6.8 MATLAB EXAMPLES


Example 6.1: Example of convolution (see Figure 6.6) 

In this example, we demonstrate the convolution sum, y(n) = x(n) * h(n), and its
commutative property, i.e. x(n) * h(n) = h(n) * x(n).


MATLAB code

% Define a sequence x(n) whose total length is 9, but the length of non-zero
% elements is 5. Also define a sequence h(n) whose total length is 11, but the
% length of non-zero elements is 8. And define indices for x(n) and h(n).
% Note that MATLAB uses the index from 1, whereas we define the sequence from n = 0.
clear all
x=[1 1 1 1 1 0 0 0 0];
h=[8 7 6 5 4 3 2 1 0 0 0];
nx=[0:length(x)-1];
nh=[0:length(h)-1];

% Perform the convolution sum using the MATLAB function 'conv', where
% y1(n) = h(n) * x(n) and y2(n) = x(n) * h(n). Both will give the same results.
% Note that the length of 'conv(h,x)' is 'length(h) + length(x) - 1'.
% And define the index for both y1(n) and y2(n).
y1=conv(h,x);
y2=conv(x,h);
ny=[0:length(y1)-1];

% Plot the sequences x(n), h(n), y1(n) and y2(n). Note that y1(n) and y2(n) are
% the same, the total length of y1(n) is 19, which is '11 + 9 - 1', and the
% length of the non-zero elements is 12, which is '8 + 5 - 1'.
figure(1); stem(nx,x, 'd', 'filled')
xlabel('\itn'); ylabel('\itx\rm(\itn\rm)')
figure(2); stem(nh,h, 'filled')
xlabel('\itn'); ylabel('\ith\rm(\itn\rm)')
figure(3); stem(ny,y1, 'filled')
xlabel('\itn'); ylabel('\ity_1\rm(\itn\rm)')
figure(4); stem(ny,y2, 'filled')
xlabel('\itn'); ylabel('\ity_2\rm(\itn\rm)')


Results

[Figure panels (a), (b) and (c): stem plots of the resulting sequences]


Example 6.2a: DFT of a sinusoidal signal 

Case A: Truncated exact number of periods (see Figures 6.12 and 6.13).

Consider a sinusoidal signal $x(t) = A \sin 2\pi pt$, $p = 1/T_p$ Hz. Sample this signal at the
sampling rate $f_s = 10/T_p$ Hz. We examine two cases: (i) data are truncated at exactly
one period (10-point DFT), (ii) data are truncated at exactly five periods (50-point DFT).
For this example, we use A = 2 and p = 1 Hz. Note that the Fourier integral gives the
value A/2 = 1 at p Hz.


MATLAB code

% Define parameters and the sampling rate fs such that 10 samples per period Tp.
% Truncate the data exactly one period (T1) and five periods (T2). Define time
% variables t1 and t2 for each case.
clear all
A=2; p=1; Tp=1/p; fs=10/Tp;
T1=1*Tp; T2=5*Tp;
t1=[0:1/fs:T1-1/fs];
t2=[0:1/fs:T2-1/fs];

% Generate the sampled and truncated signals x1 (one period) and x2 (five
% periods). Perform the DFT of each signal. (A cosine is generated here; for an
% exact number of periods its modulus |X(k)| is the same as that of the sine.)
x1=A*cos(2*pi*p*t1);
x2=A*cos(2*pi*p*t2);
X1=fft(x1); X2=fft(x2);

% Calculate the frequency variables f1 and f2 for each case.
N1=length(x1); N2=length(x2);
f1=fs*(0:N1-1)/N1;
f2=fs*(0:N2-1)/N2;

% Plot the results (modulus) of 10-point DFT. Note the frequency range 0 to 9 Hz
% (fs - fs/N) and the peak amplitude AT/(2*Delta) = 5A = 10 (see Figure 6.12).
% Since exact number of period is taken for DFT, all the frequency components
% except p = 1 Hz (and 9 Hz, which is the mirror image of p Hz) are zero.
figure(1)
stem(f1, abs(X1), 'fill')
xlabel('Frequency (Hz)')
ylabel('Modulus of \itX\rm(\itk\rm)'); axis([0 9.9 0 10])

% This plots the same results, but now the DFT coefficients are scaled
% appropriately. Note that the modulus of X(k) is divided by the sampling rate
% (fs) and window length (T1). Note that it also gives the same scaling effect
% if X(k) is divided by the number of points N1. The result corresponds to the
% Fourier integral, i.e. the peak amplitude is now A/2 = 1 at p Hz.
figure(2)
stem(f1, abs(X1)/fs/T1, 'fill'); % this is the same as stem(f1, abs(X1)/N1, 'fill')
xlabel('Frequency (Hz)')
ylabel('Modulus (scaled)'); axis([0 9.9 0 1])

% Plot the results (modulus) of 50-point DFT. Note that the peak amplitude is
% AT/(2*Delta) = 25A = 50. In this case, we used the data five times longer in
% 'time' than in the previous case. This results in an increase of frequency
% resolution, i.e. the resolution is increased five times that in the previous case.
figure(3)
stem(f2, abs(X2), 'fill')
xlabel('Frequency (Hz)')
ylabel('Modulus of \itX\rm(\itk\rm)')

% This plots the same results, but, as before, the DFT coefficients are scaled
% appropriately, thus A/2 = 1 at p Hz.
figure(4)
stem(f2, abs(X2)/fs/T2, 'fill'); % this is the same as stem(f2, abs(X2)/N2, 'fill')
xlabel('Frequency (Hz)'); ylabel('Modulus (scaled)')


Results

[Figure panels: modulus of X(k) versus frequency (Hz) for the 10-point and 50-point DFTs, including (a2) N = 10, magnitude is scaled appropriately, and (a3); the points p = 1 Hz, N/2 ($f_s/2$) and N - 1 ($f_s - f_s/N$) are marked]


Comment: In this example, we applied the scaling factor $1/(f_s T) = 1/N$ to X(k) to
relate its amplitude to the corresponding Fourier integral X(f). However, this is only true
for periodic signals which have discrete spectra in the frequency domain. In fact, using
the DFT, we have computed the Fourier coefficients (amplitudes of specific frequency
components) for a periodic signal, i.e. $c_k \approx X_k/N$ (see Equation (3.45) in Chapter 3).

For transient signals, since we compute the amplitude density rather than amplitude
at a specific frequency, the correct scaling factor is $1/f_s$ or $\Delta$ (assuming that the
rectangular window is used), although there is some distortion in the frequency domain
due to the windowing effect. The only exception of this scaling factor may be the delta
function. Note that $\delta(n)$ is not the result of sampling the Dirac delta function $\delta(t)$ which
is a mathematical idealization.
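
The $1/f_s$ scaling for a transient signal can be illustrated with a signal whose Fourier integral is known in closed form. The sketch below is not one of the original examples: it assumes a decaying exponential $x(t) = e^{-at}$, $t \ge 0$, for which $X(f) = 1/(a + j2\pi f)$, and the parameter values and variable names are chosen only for illustration.

```matlab
% Scaling of the DFT by 1/fs for a transient signal (assumed example):
% x(t) = exp(-a*t), t >= 0, whose Fourier integral is X(f) = 1/(a + j*2*pi*f).
a = 10;  fs = 1000;  T = 2;     % decay rate, sampling rate (Hz), record length (s)
t = 0:1/fs:T-1/fs;
x = exp(-a*t);                  % the signal has decayed to a negligible level by t = T
N = length(x);
f = fs*(0:N-1)/N;
Xd = fft(x)/fs;                 % DFT scaled by 1/fs (i.e. multiplied by Delta)
Xa = 1./(a + 1i*2*pi*f);        % Fourier integral at the same frequencies
idx = find(f <= 20);            % compare well below the folding frequency fs/2
err = max(abs(Xd(idx) - Xa(idx)));
disp([abs(Xd(1)) abs(Xa(1))]); disp(err);
```

The scaled DFT follows the Fourier integral closely at these frequencies, whereas dividing by N instead would give values that depend on the record length.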


Example 6.2b: DFT of a sinusoidal signal 

Case B: Truncated with a non-integer number of periods. (See also the windowing effect
in Sections 4.11 and 3.6.)

We use the same signal as in MATLAB Example 6.2a, i.e. $x(t) = A \sin 2\pi pt$,
$p = 1/T_p$ Hz, $f_s = 10/T_p$ Hz, A = 2, and p = 1 Hz. However, we truncate the data in
two cases: (i) data are truncated one and a half periods (15-point DFT), (ii) data are
truncated three and a half periods (35-point DFT). Note that we use an odd number for the DFT.


MATLAB code

% Exactly the same as previous example (Case A), except T1 is one and a half
% periods of the signal and T2 is three and a half periods of the signal.
clear all
A=2; p=1; Tp=1/p; fs=10/Tp;
T1=1.5*Tp; T2=3.5*Tp;
t1=[0:1/fs:T1-1/fs]; t2=[0:1/fs:T2-1/fs];
x1=A*cos(2*pi*p*t1); x2=A*cos(2*pi*p*t2);
X1=fft(x1); X2=fft(x2);
N1=length(x1); N2=length(x2);
f1=fs*(0:N1-1)/N1; f2=fs*(0:N2-1)/N2;

% Perform 5000-point DFT by adding zeros at the end of each sequence x1 and x2,
% i.e. 'zero padding' is applied for demonstration purpose. Calculate new
% frequency variable accordingly.
X1z=fft([x1 zeros(1,5000-N1)]); % zero padding
X2z=fft([x2 zeros(1,5000-N2)]); % zero padding
Nz=length(X1z);
fz=fs*(0:Nz-1)/Nz;

% Plot the results (modulus) of 15-point DFT (stem plot) and DFT with zero
% padding (dotted line). Magnitudes of DFT coefficients are scaled appropriately.
% Examine the effect of windowing in this figure. Note the change of magnitude
% at the peak (compare this with the previous example). Also, note that we do
% not have the value at the frequency p = 1 Hz.
figure(1)
stem(f1, abs(X1)/fs/T1, 'fill'); hold on
plot(fz, abs(X1z)/fs/T1, 'r:'); hold off
xlabel('Frequency (Hz)'); ylabel('Modulus (scaled)')
axis([0 10 0 1.02])

% Plot the results (modulus) of 35-point DFT (stem plot) and DFT with zero
% padding (dotted line). Note that the resolution is improved, but there is
% still a significant amount of smearing and leakage due to windowing. Again,
% we do not have the DFT coefficient at the frequency p = 1 Hz.
figure(2)
stem(f2, abs(X2)/fs/T2, 'fill'); hold on
plot(fz, abs(X2z)/fs/T2, 'r:'); hold off
xlabel('Frequency (Hz)'); ylabel('Modulus (scaled)')
axis([0 10 0 1.02])


Results 



Example 6.3: DFT of a sinusoidal signal 

Increase of sampling rate does not improve the frequency resolution; it only increases 
the frequency range to be computed (with a possible benefit of avoiding aliasing, see 
aliasing in Chapter 5). 

We use the same signal as in the previous MATLAB example, i.e. $x(t) = A \sin 2\pi pt$,
$p = 1/T_p$ Hz, A = 2 and p = 1 Hz. However, the sampling rate is doubled, i.e.
$f_s = 20/T_p$ Hz. We examine two cases: (a) data length $T = T_p$ (20-point DFT; this
corresponds to the first case of MATLAB Example 6.2a), (b) data length $T = 1.5T_p$
(30-point DFT; this corresponds to the first case of MATLAB Example 6.2b).


MATLAB code

% Exactly the same as previous examples (MATLAB Examples 6.2a and 6.2b),
% except that the sampling rate fs is now doubled.
clear all
A=2; p=1; Tp=1/p; fs=20/Tp;
T1=1*Tp; T2=1.5*Tp;
t1=[0:1/fs:T1-1/fs]; t2=[0:1/fs:T2-1/fs];
x1=A*cos(2*pi*p*t1);
x2=A*cos(2*pi*p*t2);
X1=fft(x1); X2=fft(x2);
N1=length(x1); N2=length(x2);
f1=fs*(0:N1-1)/N1; f2=fs*(0:N2-1)/N2;

% Plot the results (modulus) of 20-point DFT (i.e. for the case of T = Tp).
% Note that the frequency spacing is 1 Hz which is exactly the same as MATLAB
% Example 6.2a (when N = 10), and the folding frequency is now 10 Hz (5 Hz in
% the previous example).
figure(1)
stem(f1, abs(X1)/fs/T1, 'fill')
xlabel('Frequency (Hz)'); ylabel('Modulus (scaled)')
axis([0 20 0 1])

% Plot the results (modulus) of 30-point DFT (i.e. for the case of T = 1.5Tp).
% Again, the result is the same as MATLAB Example 6.2b (when N = 15), within
% the frequency range 0 to 5 Hz.
figure(2)
stem(f2, abs(X2)/fs/T2, 'fill')
xlabel('Frequency (Hz)'); ylabel('Modulus (scaled)')
axis([0 20 0 1])


Results

[Figure: modulus (scaled) versus frequency (Hz), 0 to 20 Hz. In panel (a) the low-frequency region is annotated: ‘This region is exactly the same as previous example (Example 6.2a, see Figure (a2))’]

(a) $f_s = 20/T_p$, N = 20 (i.e. $T = T_p$)

(b) $f_s = 20/T_p$, N = 30 (i.e. $T = 1.5T_p$)


Comments: Compare these results with the previous examples (MATLAB Example 6.2a, 
6.2b). Recall that the only way of increasing frequency resolution is by increasing data 
length (in time). Note that, since the sampling rate is doubled, double the amount of data 
is needed over the previous example in order to get the same frequency resolution. 


Example 6.4: Symmetry properties of DFT (see Section 6.4) 

Consider a discrete sequence $x(n) = a^n u(n)$, 0 < a < 1, n = 0, 1, . . . , N - 1. In this
example, we use a = 0.3 and examine the symmetry properties of the DFT for two
cases: (a) N is an odd number (N = 9), and (b) N is an even number (N = 10).


MATLAB code

% Define the parameter a, and variables n1 (for the odd-numbered sequence) and
% n2 (for the even-numbered sequence).
clear all
a=0.3;
n1=0:8; % 9-point sequence
n2=0:9; % 10-point sequence

% Create two sequences x1 and x2 according to the above equation, i.e.
% x(n) = a^n u(n). Perform the DFT of each sequence, i.e. X(k) = DFT[x(n)].
x1=a.^n1; x2=a.^n2;
X1=fft(x1); X2=fft(x2);

% Plot the real part of the DFT of the first sequence x1. The MATLAB command
% 'subplot(2,2,1)' divides the figure(1) into four sections (2x2) and allocates
% the subsequent graph to the first section.
figure(1)
subplot(2,2,1); stem(n1, real(X1), 'fill')
axis([-0.5 8.5 0 1.6])
xlabel('\itk'); ylabel('Re[\itX\rm(\itk\rm)]')

% Plot the imaginary part of the DFT of the first sequence x1. Note that, since
% N/2 is not an integer number, we cannot evaluate the DFT coefficient for this
% number. Thus, the zero-crossing point cannot be shown in the figure.
subplot(2,2,2); stem(n1, imag(X1), 'fill')
axis([-0.5 8.5 -0.4 0.4])
xlabel('\itk'); ylabel('Im[\itX\rm(\itk\rm)]')

% Plot the modulus of the DFT of the first sequence x1.
subplot(2,2,3); stem(n1, abs(X1), 'fill')
axis([-0.5 8.5 0 1.6])
xlabel('\itk'); ylabel('|\itX\rm(\itk\rm)|')

% Plot the phase of the DFT of the first sequence x1. Similar to the imaginary
% part of the DFT, there is no zero-crossing point (or pi) in the figure.
subplot(2,2,4); stem(n1, angle(X1), 'fill')
axis([-0.5 8.5 -0.4 0.4])
xlabel('\itk'); ylabel('arg\itX\rm(\itk\rm)')

% Plot the real part of the DFT of the second sequence x2.
figure(2)
subplot(2,2,1); stem(n2, real(X2), 'fill')
axis([-0.5 9.5 0 1.6])
xlabel('\itk'); ylabel('Re[\itX\rm(\itk\rm)]')

% Plot the imaginary part of the DFT of the second sequence x2. Since N/2 is an
% integer number, we can evaluate the DFT coefficient for this number. Note
% that the value is zero at k = N/2.
subplot(2,2,2); stem(n2, imag(X2), 'fill')
axis([-0.5 9.5 -0.4 0.4])
xlabel('\itk'); ylabel('Im[\itX\rm(\itk\rm)]')

% Plot the modulus of the DFT of the second sequence x2.
subplot(2,2,3); stem(n2, abs(X2), 'fill')
axis([-0.5 9.5 0 1.6])
xlabel('\itk'); ylabel('|\itX\rm(\itk\rm)|')

% Plot the phase of the DFT of the second sequence x2. Similar to the imaginary
% part of the DFT, there is a zero-crossing point at k = N/2. (The value is
% zero because the real part is positive. If the real part is negative the
% value will be pi.)
subplot(2,2,4); stem(n2, angle(X2), 'fill')
axis([-0.5 9.5 -0.4 0.4])
xlabel('\itk'); ylabel('arg\itX\rm(\itk\rm)')


Results

[Figure: real part, imaginary part, modulus and phase of X(k) versus k]

(b) The 10-point DFT

Comments: Compare the results of the even-numbered DFT and odd-numbered DFT.


Example 6.5: Zero-padding approach to improve (artificially) the quality of a 
measured signal 


Consider the free response of a single-degree-of-freedom system

$$x(t) = \frac{A}{\omega_d} e^{-\zeta\omega_n t} \sin \omega_d t \quad \text{and} \quad F\{x(t)\} = \frac{A}{\omega_n^2 - \omega^2 + j2\zeta\omega_n\omega}$$

where A = 200, $\omega_n = 2\pi f_n = 2\pi(10)$ and $\omega_d = \omega_n\sqrt{1 - \zeta^2}$. In order to simulate a practical
situation, a small amount of noise (Gaussian white) is added to the signal. Suppose
the system is heavily damped, e.g. $\zeta = 0.3$; then the signal x(t) falls into the noise level
quickly.

Now, there are two possibilities of performing the DFT. One is to use only the 
beginning of the signal where the signal-to-noise ratio is high, but this will give a poor 
frequency resolution. The other is to use longer data (including the noise-dominated part) 
to improve the frequency resolution. However, it is significantly affected by noise in the 
frequency domain. 

The above problem may be resolved by keeping only the beginning part of the signal (i.e.
truncating it) and adding zeros to it (this increases the measurement time artificially).


MATLAB code

% Define the sampling rate fs = 100 Hz, total record time T = 5 seconds, and the
% time variable t from 0 to 'T-1/fs' seconds. Also generate the sampled signal
% according to the equation above.
clear all
fs=100; T=5;
t=[0:1/fs:T-1/fs];
A=200; zeta=0.3; wn=2*pi*10; wd=sqrt(1-zeta^2)*wn;
x=(A/wd)*exp(-zeta*wn*t).*sin(wd*t);

% Calculate the variance of the signal (note that the MATLAB function 'var(x)'
% can also be used).
var_x=sum((x-mean(x)).^2)/(length(x)-1); % var_x=var(x)

% MATLAB function 'randn(size(x))' generates the normally distributed random
% numbers with the same size as x, and 'randn('state',0)' initializes the
% random number generator.
randn('state',0);

% Generate the noise sequence whose power is 0.25 % of the signal power that
% gives the SNR of approximately 26 dB (see Equation (5.30)). Then, add this
% noise to the original signal.
noise=0.05*sqrt(var_x)*randn(size(x));
xn=x+noise;

% Plot the noisy signal. It can be easily observed that the signal falls into
% the noise level at about 0.4 seconds. Note that 0.4 seconds corresponds to
% the 40 data points. Thus, for the DFT, we may use the signal up to 0.4 seconds
% (40-point DFT) at the expense of the frequency resolution, or use the whole
% noisy signal (500-point DFT) to improve the resolution.
figure(1)
plot(t, xn)
axis([0 2 -0.8 2.2])
xlabel('\itt\rm (seconds)'); ylabel('\itx\rm(\itt\rm)')
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14 

Xnl=fft(xn,40); % 40 corresponds to 

First, perform the DFT using only the first 


0.4 seconds in time 

40 data points of the signal. The 

15 

Nl=length(Xnl); 

MATLAB function ‘fft(xn, 40)' performs 


fl=fs*(0:Nl-l)/Nl : 

the DFT of xn using the first 40 elements 

16 

Xn2=fft(xn); 

of xn. Next, perform the DFT using the 

17 

N2=length(xn); 

whole noisy signal (500-point DFT). 


f2=fs*(0:N2-l)/N2; 

Calculate the corresponding frequency 
variables. 

18 

Xa=A./(wn~2 - (2*pi*f2)72 + 

Calculate the Fourier integral according to 


i* 2* zeta* wn* (2* pi*f2)) ; 

the formula above. This will be used for 
the purpose of comparison. 

19 

figure(2) 

Plot the modulus of the 40-point DFT 

20 

plot(fl(l:Nl/2+l), 

(solid line), and plot the true magnitude 


20* log 1 0(abs(Xn 1 ( 1 :N 1 /2 + 1 )/fs))) 

spectrum of the Fourier transform (dashed 

21 

hold on 

line). Note the poor frequency resolution 

22 

plot(f2( 1 :N2/2+ 1 ), 

20* log 1 0(abs(Xa( 1 :N2/2+ 1 ))), 'r:') 

in the case of the 40-point DFT. 

23 

xlabel('Frequency (Hz)'); 
ylabel('Modulus (dB)'); hold off 


24 

figure(3) 

Plot the modulus of the DFT of the whole 

25 

plot(f2( 1 :N2/2+ 1 ), 

noisy signal (solid line), and plot the true 


20* log 1 0(abs(Xn2( 1 :N2/2+ 1 )/fs))) 

magnitude spectrum of the Fourier 

26 

hold on 

transform (dashed line). Note the effect of 

27 

plot(f2( 1 :N2/2+ 1), 

20* log 1 0(abs(Xa( 1 :N2/2+ 1 ))), 'r:') 

noise in the frequency domain. 

28 

xlabel('Frequency (Hz)'); 
ylabel('Modulus (dB)'); hold off 


29 

Xnz=fft(xn( 1 :40),N2) ; 

Now, perform the DFT of the truncated 
and zero-padded signal. The MATLAB 
function ‘fft(xn(l:40),N2)’ takes only the 
first 40 data elements of xn, then adds 
zeros up to the number N2. 

30 

figure(4) 

Plot the modulus of the DFT of the 

31 

plot(f2( 1 :N2/2+ 1 ) , 

zero-padded signal (solid line), and plot 


20* log 1 0(abs(Xnz( 1 :N2/2+ 1 )/fs))) 

the true magnitude spectrum of the 

32 

hold on 

Fourier transform (dashed line). Note the 

33 

plot(f2( 1 :N2/2+ 1 ) , 

20* log 1 0(abs(Xa( 1 :N2/2+ 1 ))), 'r:') 

improvement in the frequency domain. 

34 

xlabel('Frequency (Hz)'); 
ylabel('Modulus (dB)'); hold off 
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Results 



t (seconds) 

(a) Time signal with additive noise (SNR is about 26dB) 



(b) The 40-point DFT (truncated at 0.4 seconds) 


-20 

-25 


-30 

S' 

3-35 

£ 

•§-40 

o 

2 

-45 


-50 


-55 



(c) The 500-point DFT (truncated at 5 seconds) (d) The 500-point DFT (truncated at 0.4 seconds, and 

added zeros up to 5 seconds) 
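
The numbers quoted in this example can be checked with a few lines of arithmetic. The following is only a minimal sketch using the settings stated above (fs = 100 Hz, DFT lengths of 40 and 500, noise scaled by 0.05); the variable names are illustrative and are not part of the original listing. Note that the zero-padded DFT has the finer frequency spacing of the 500-point DFT, although it is still computed from only 0.4 seconds of data.

```matlab
% Sketch: frequency spacing and nominal SNR for the settings of this example.
fs = 100;                          % sampling rate (Hz), as in the example
N_short = 40;  N_long = 500;       % DFT lengths used in the example
df_short = fs/N_short;             % spacing of the 40-point DFT (Hz)
df_long  = fs/N_long;              % spacing of the 500-point and zero-padded DFTs (Hz)
noise_ratio = 0.05;                % noise std / signal std, as in 'noise=0.05*sqrt(var_x)*...'
power_ratio = noise_ratio^2;       % noise power / signal power (0.25 %)
snr_dB = -10*log10(power_ratio);   % nominal SNR in dB
fprintf('df = %.2f Hz (40 points), %.2f Hz (500 points), SNR = %.1f dB\n', ...
        df_short, df_long, snr_dB);
```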


Comments: In this example, apart from the zero-padding feature, there is another aspect
to consider. Consider the DFT of the noise-free signal (i.e. noise is not added), and
compare it with the Fourier integral. To do this, add the following lines at the end of the
above MATLAB code:

X=fft(x);
figure(5)
plot(f2(1:N2/2+1), 20*log10(abs(X(1:N2/2+1)/fs))); hold on
plot(f2(1:N2/2+1), 20*log10(abs(Xa(1:N2/2+1))), 'r:')
xlabel('Frequency (Hz)'); ylabel('Modulus (dB)'); hold off

The results are shown in Figure (e). Note the occurrence of aliasing in the
DFT result. In computer simulations, we have evaluated the values of x(t) at t =
0, 1/f_s, 2/f_s, ..., T − 1/f_s by simply inserting the time variable into the equation without
doing any preprocessing. In the MATLAB code, the act of defining the time variable
‘t=[0:1/fs:T-1/fs];’ is the ‘sampling’ of the analogue signal x(t). Since we cannot (in a
simple way in computer programming) apply a low-pass filter before the sampling, we
always have to face the aliasing problem in computer simulations. Note that aliasing does
occur even if the signal is obtained by solving the corresponding ordinary differential
equation using a numerical integration method such as the Runge-Kutta method. Thus,
we may use a much higher sampling rate to minimize the aliasing problem, but we cannot
avoid it completely.
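
The effect of the sampling rate on the aliasing error can be illustrated by repeating the noise-free calculation at two rates. This is a rough sketch only: the second rate (1000 Hz) and the comparison band (0 to 40 Hz) are assumptions chosen for illustration, and the error measure (largest dB difference from the Fourier integral) is just one convenient choice.

```matlab
% Sketch: aliasing error of the noise-free DFT for two sampling rates.
A = 200; zeta = 0.3; wn = 2*pi*10; wd = sqrt(1-zeta^2)*wn; T = 5;
fs_list = [100 1000];                 % 100 Hz as in the example; 1000 Hz is an assumed higher rate
err_dB = zeros(size(fs_list));
for k = 1:length(fs_list)
    fs = fs_list(k);
    t = 0:1/fs:T-1/fs;                % 'sampling' of the analogue signal
    x = (A/wd)*exp(-zeta*wn*t).*sin(wd*t);
    N = length(x);
    f = fs*(0:N-1)/N;
    X = fft(x)/fs;                    % DFT scaled to approximate the Fourier integral
    Xa = A./(wn^2 - (2*pi*f).^2 + 1i*2*zeta*wn*(2*pi*f));   % Fourier integral
    idx = find(f <= 40);              % compare over 0 to 40 Hz only (assumed band)
    err_dB(k) = max(abs(20*log10(abs(X(idx))) - 20*log10(abs(Xa(idx)))));
end
disp(err_dB)                          % largest dB error for each sampling rate
```

The error at the higher rate is much smaller but not zero, in line with the remark that aliasing can be reduced but not avoided.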

Note also that aliasing occurs over the ‘entire’ frequency range, since the original
analogue signal is not band-limited. It is also interesting to compare the effect of aliasing
in the low-frequency region (compared with the natural frequency, f_n = 10 Hz) and in the
high-frequency region, i.e. the magnitude spectrum is increased at high frequencies, but
decreased at low frequencies. This is due to the phase structure of the Fourier transform
of the original signal, i.e. arg X(f). Note further that there is a phase shift at the natural
frequency (see Fahy and Walker, 1998). Thus the phase difference between X(f) and its
mirror image is approximately 2π at the folding frequency and is approximately π at
zero frequency. In other words, X(f) and the aliased part are in phase at high frequencies
(increase the magnitude) and out of phase at low frequencies (decrease the magnitude),
as can be seen from Figures (f) and (g).



(e) DFT of noise-free signal

(f) Magnitude spectrum of the DFT in full frequency range

(g) Phase spectrum of the DFT in full frequency range (horizontal axis: Frequency (Hz), 0 to 100)




Example 6.6: Circular (periodic) and linear convolutions using the DFT 


Consider the following two finite sequences of length N = 5: 

x(n) = [1 3 5 3 1] and h(n) = [9 7 5 3 1]

Perform the circular convolution and the linear convolution using the DFT. 


MATLAB code (comments are given on the lines starting with %)

% Define the sequences x(n) and h(n).
clear all
x=[1 3 5 3 1]; h=[9 7 5 3 1];

% Perform the DFT of each sequence. Take the inverse DFT of the product X(k) and
% H(k) to obtain the circular convolution result. Define the variable for the x-axis.
X=fft(x); H=fft(h);
yp=ifft(X.*H);
np=0:4;

% Plot the sequences x and h, and the results of circular convolution. Note that
% the sequences x and h are periodic in effect.
figure(1)
subplot(3,1,1); stem(np, x, 'd', 'fill')
axis([-0.4 4.4 0 6])
xlabel('\itn'); ylabel('\itx_p\rm(\itn\rm)')
subplot(3,1,2); stem(np, h, 'fill')
axis([-0.4 4.4 0 10])
xlabel('\itn'); ylabel('\ith_p\rm(\itn\rm)')
subplot(3,1,3); stem(np, yp, 'fill')
axis([-0.4 4.4 0 90])
xlabel('\itn'); ylabel('\ity_p\rm(\itn\rm)')

% Perform the linear convolution using the DFT. Note that zeros are added
% appropriately when calculating DFT coefficients. Also, note that the MATLAB
% function 'conv(x, h)' will give the same result (in fact, this function uses the
% same algorithm).
Xz=fft([x zeros(1,length(h)-1)]);
Hz=fft([h zeros(1,length(x)-1)]);
yz=ifft(Xz.*Hz);
nz=0:8;

% Plot the zero-padded sequences, and the results of linear convolution using
% the DFT.
figure(2)
subplot(3,1,1); stem(nz, [x 0 0 0 0], 'd', 'fill')
axis([-0.4 8.4 0 6])
xlabel('\itn'); ylabel('\itx\rm(\itn\rm)')
subplot(3,1,2); stem(nz, [h 0 0 0 0], 'fill')
axis([-0.4 8.4 0 10])
xlabel('\itn'); ylabel('\ith\rm(\itn\rm)')
subplot(3,1,3); stem(nz, yz, 'fill')
axis([-0.4 8.4 0 90])
xlabel('\itn'); ylabel('\ity\rm(\itn\rm)')

Results

(a) Circular convolution, y_p(n) = x_p(n) * h_p(n) = x(n) ⊛ h(n)

(b) Linear convolution using the DFT, y(n) = x(n) * h(n)
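
The relationship between the two results can be checked numerically: the circular convolution equals the linear convolution with its tail folded back onto the first N samples. The sketch below uses the sequences of this example; the variable names (y_lin, y_wrap and so on) are illustrative only.

```matlab
% Sketch: circular convolution as a 'wrapped' linear convolution.
x = [1 3 5 3 1]; h = [9 7 5 3 1];
N = length(x);
y_lin = conv(x, h);                               % linear convolution, length 2N-1 = 9
y_wrap = y_lin(1:N);                              % first N samples ...
y_wrap(1:N-1) = y_wrap(1:N-1) + y_lin(N+1:end);   % ... plus the tail folded back
y_circ = real(ifft(fft(x).*fft(h)));              % circular convolution via the N-point DFT
y_zp = real(ifft(fft(x, 2*N-1).*fft(h, 2*N-1)));  % zero-padded DFTs give the linear result
disp(y_lin)
disp(y_wrap)
```

Here y_lin is [9 34 71 80 65 40 19 6 1] and y_wrap is [49 53 77 81 65], which agree (to rounding error) with the zero-padded and the N-point DFT results respectively.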


Example 6.7: System identification (impact testing of a structure) 

Consider the experimental setup shown in Figure (a) (see also Figure 1.11 in Chapter 1),
and suppose we want to identify the system (FRF between A and B) by the impact
testing method. Note that many modern signal analysers are equipped with built-in signal
conditioning modules.

(a) Experimental setup

If the measurement noise is ignored, both input and output are deterministic and
transient. Thus, provided that the input x(t) is sufficiently narrow in time (broad in
frequency), we can obtain the FRF between A and B over a desired frequency range from
the relationship

$$H(f) = \frac{Y(f)}{X(f)} \tag{6.55}$$

However, as illustrated in Figure (a), the actual signals are contaminated with noise. Also,
the system we are identifying is not the actual physical system H (between A and B) but
an overall system that also includes the individual frequency responses of sensors and
filters, the effects of quantization noise, measurement (external) noise and the experimental
rig. Nevertheless, for convenience we shall use the notation H for this overall system as well.

Measurement noise makes it difficult to use Equation (6.55). Thus, we usually
perform the same experiment several times and average the results to estimate H(f).
The details of various estimation methods are discussed in Part II of this book. Roughly
speaking, one estimation method of FRF may be expressed as

$$H_1(f) \approx \frac{\frac{1}{N}\sum_{n=1}^{N} X_n^*(f)\,Y_n(f)}{\frac{1}{N}\sum_{n=1}^{N} X_n^*(f)\,X_n(f)} \tag{6.56}$$

where N is the number of times the experiment is replicated (equivalently it is the number
of averages). Note that different values of X_n(f) and Y_n(f) are produced in each
experiment, and if N = 1 Equations (6.55) and (6.56) are the same. In this MATLAB example,
we shall estimate the FRF based on both Equations (6.55) and (6.56), and compare the
results.

The experiment is performed 10 times, and the measured data are stored in the
file ‘impact_data_raw.mat’, where the sampling rate is chosen as f_s = 256 Hz, and each
signal is recorded for 8 seconds (which results in the frequency resolution of Δf = 1/8 =
0.125 Hz, and each signal is 2048 elements long). The variables in the file are ‘in1, in2, ...,
in10’ (input signals) and ‘out1, out2, ..., out10’ (output signals). The anti-aliasing filter
is automatically controlled by the signal analyser according to the sampling rate (in this
case, the cut-off frequency is about 100 Hz). Also, the signal analyser is configured to
remove the d.c. component of the measured signal (i.e. high-pass filtering with cut-on at
about 5 Hz).

Before performing the DFT of each signal, let us investigate the measured signals.
If we type the following script in the MATLAB command window, the results will be as
shown in Figures (b1) and (b2):

load impact_data_raw
fs=256; N=length(in1); f=fs*(0:N-1)/N;
T=N/fs; t=0:1/fs:T-1/fs;
figure(1); plot(t, in1); axis([-0.1 8 -1.5 2.5])
xlabel('\itt\rm (seconds)'); ylabel('\itx\rm(\itt\rm)')
figure(2); plot(t, out1); axis([-0.1 8 -4 4])
xlabel('\itt\rm (seconds)'); ylabel('\ity\rm(\itt\rm)')

Figures (b1) and (b2): x(t) and y(t) plotted against t (seconds)


Note that the output signal is truncated before the signal dies away completely.
However, the input signal dies away quickly and noise dominates later. If we type in the
following script we can see the effect of noise on the input signal, i.e. the DFT of the
input signal shows a noisy spectrum as in Figure (c):

In1=fft(in1);
figure(3); plot(f(1:N/2+1), 20*log10(abs(In1(1:N/2+1))))
xlabel('Frequency (Hz)'); ylabel('Modulus (dB)')
axis([0 128 -70 30])


The data files can be downloaded from the Companion Website (www.wiley.com/go/shin_hammond)

Now let us look at the input signal in more detail by typing

plot(in1(1:50)); grid on

As shown in Figure (d1), the input signal after the 20th data point and before the
4th data point is dominated by noise. Thus, similar to MATLAB Example 6.5, the data
in this region are replaced by the noise level (note that they are not replaced by zeros due
to the offset of the signal). The following MATLAB script replaces the noise region with
constant values and compensates the offset (note that the output signal is not offset, so it
is replaced with zeros below the 4th data point):

in1(1:4)=in1(20); in1(20:end)=in1(20); in1=in1-in1(20);
out1(1:4)=0;

The result is shown in Figure (d2).

Now we type the script below to see the effect of this preprocessing; the result is a
much cleaner spectrum, as shown in Figure (e). Note that each signal has a different transient
characteristic, so it is preprocessed individually and differently. The preprocessed data
set is stored in the file ‘impact_data_pre_processed.mat’.

In1=fft(in1);
plot(f(1:N/2+1), 20*log10(abs(In1(1:N/2+1))))
xlabel('Frequency (Hz)'); ylabel('Modulus (dB)')
axis([0 128 -70 30])

Now, using these two data sets, we shall estimate the FRF based on both Equations
(6.55) and (6.56).

Case A: FRF estimate by Equation (6.55), i.e.

$$H(f) = \frac{Y(f)}{X(f)}$$


MATLAB code (comments are given on the lines starting with %)

% Load the data set which is not preprocessed. Define frequency and time variables.
clear all
load impact_data_raw
fs=256; N=length(in1); f=fs*(0:N-1)/N;
T=N/fs; t=0:1/fs:T-1/fs;

% Perform the DFT of input signal and output signal (only one set of input-output
% records). Then, calculate the FRF according to Equation (6.55).
In1=fft(in1); Out1=fft(out1);
H=Out1./In1;

% Plot the magnitude and phase spectra of the FRF (for the frequency range from
% 5 Hz to 95 Hz).
figure(1)
plot(f(41:761), 20*log10(abs(H(41:761))))
axis([5 95 -30 50])
xlabel('Frequency (Hz)'); ylabel('FRF (Modulus, dB)')
figure(2)
plot(f(41:761), unwrap(angle(H(41:761))))
axis([5 95 -3.5 3.5])
xlabel('Frequency (Hz)'); ylabel('FRF (Phase, rad)')

% Load the preprocessed data set, and perform the DFT. Then, calculate the FRF
% according to Equation (6.55).
load impact_data_pre_processed
In1=fft(in1); Out1=fft(out1);
H=Out1./In1;

% Plot the magnitude and phase spectra of the FRF.
figure(3)
plot(f(41:761), 20*log10(abs(H(41:761))))
axis([5 95 -30 50])
xlabel('Frequency (Hz)'); ylabel('FRF (Modulus, dB)')
figure(4)
plot(f(41:761), unwrap(angle(H(41:761))))
axis([5 95 -3.5 3.5])
xlabel('Frequency (Hz)'); ylabel('FRF (Phase, rad)')
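
The index range 41:761 used in the plot commands follows from the frequency spacing. A quick check, assuming the sampling parameters stated earlier in this example (f_s = 256 Hz, 8-second records); the names k_lo and k_hi are illustrative only:

```matlab
% Sketch: which DFT indices correspond to 5 Hz and 95 Hz.
fs = 256; T = 8;               % sampling rate (Hz) and record length (s)
N = fs*T;                      % number of samples per record
df = fs/N;                     % frequency spacing (Hz), equal to 1/T
f = fs*(0:N-1)/N;              % frequency variable, as in the listing
k_lo = round(5/df) + 1;        % MATLAB index of 5 Hz (indices start at 1)
k_hi = round(95/df) + 1;       % MATLAB index of 95 Hz
fprintf('N = %d, df = %.3f Hz, f(%d) = %g Hz, f(%d) = %g Hz\n', ...
        N, df, k_lo, f(k_lo), k_hi, f(k_hi));
```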


Results

Comments: Note that the preprocessed data produce a much cleaner FRF.


Case B: FRF estimate by Equation (6.56), i.e.

$$H_1(f) \approx \frac{\frac{1}{N}\sum_{n=1}^{N} X_n^*(f)\,Y_n(f)}{\frac{1}{N}\sum_{n=1}^{N} X_n^*(f)\,X_n(f)}$$

Line  MATLAB code

 1    clear all
 2    load impact_data_raw
 3    % load impact_data_pre_processed
 4    fs=256; N=length(in1); f=fs*(0:N-1)/N;
 5    T=N/fs; t=0:1/fs:T-1/fs;
 6    Navg=10;     % Navg=3 for preprocessed data set
 7    for n=1:Navg
 8      In = ['In', int2str(n), '= fft(in', int2str(n), ');'];
 9      eval(In);
10      Out = ['Out', int2str(n), '= fft(out', int2str(n), ');'];
11      eval(Out);
12      Sxx = ['Sxx', int2str(n), '=conj(In', int2str(n), ')', '.*In', int2str(n), ';'];
13      eval(Sxx);
14      Sxy = ['Sxy', int2str(n), '= conj(In', int2str(n), ')', '.*Out', int2str(n), ';'];
15      eval(Sxy);
16    end
17    Sxx=[]; Sxy=[];
18    for n=1:Navg
19      tmp1= ['Sxx', int2str(n)];
20      Sxx=[Sxx; eval(tmp1)];
21      tmp2= ['Sxy', int2str(n)];
22      Sxy=[Sxy; eval(tmp2)];
23    end
24    Sxx=mean(Sxx); Sxy=mean(Sxy);
25    H1=Sxy./Sxx;
26    figure(1)
27    plot(f(41:761), 20*log10(abs(H1(41:761))))
28    axis([5 95 -30 50])
29    xlabel('Frequency (Hz)'); ylabel('FRF (Modulus, dB)')
30    figure(2)
31    plot(f(41:761), unwrap(angle(H1(41:761))))
32    axis([5 95 -3.5 3.5])
33    xlabel('Frequency (Hz)'); ylabel('FRF (Phase, rad)')

Comments

Lines 1-3: Load the data set which is not preprocessed (Line 2). Later, comment out this
line (with %), and uncomment Line 3 to load the preprocessed data set.

Lines 4-5: Define frequency and time variables.

Line 6: Define the number of averages N = 10 (see Equation (6.56)). Later, use N = 3 for
the preprocessed data set.

Lines 7-16: This ‘for’ loop produces variables: In1, In2, ..., In10; Out1, Out2, ..., Out10;
Sxx1, Sxx2, ..., Sxx10; Sxy1, Sxy2, ..., Sxy10. They are the DFTs of input and output
signals, and the elements of the numerator and denominator of Equation (6.56), such that,
for example, In1 = X_1, Out1 = Y_1, Sxx1 = X_1* X_1 and Sxy1 = X_1* Y_1. For more details
of the ‘eval’ function see the MATLAB help window.

Line 17: Define empty matrices which will be used in the ‘for’ loop.

Lines 18-23: The ‘for’ loop produces two matrices Sxx and Sxy, where the nth row of the
matrices is X_n* X_n and X_n* Y_n, respectively.

Lines 24-25: First calculate the numerator and denominator of Equation (6.56), and then
H_1 is obtained.

Lines 26-33: Plot the magnitude and phase spectra of the FRF. Run this MATLAB program
again using the preprocessed data set, and compare the results.

Results

(g1), (g2) No. of averages = 10 (without preprocessing)

(g3), (g4) No. of averages = 3 (with preprocessing)

(horizontal axis: Frequency (Hz))


Comments: Comparing Figures (g1), (g2) with (f1), (f2) in Case A, it can be seen that
averaging improves the FRF estimate. The effect of averaging is to remove the noises
which are ‘uncorrelated’ with the signals x(t) and y(t), as will be seen later in Part II of
this book. Note that preprocessing results in a much better FRF estimate using far fewer
averages, as can be seen from Figures (g3) and (g4).
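
The effect of averaging can also be illustrated without the measured data. The following is a minimal sketch using synthetic records only: the ‘true’ FRF (Htrue), the impact-like input shape, the noise level and the record length are all assumptions made for illustration, and the sums are accumulated directly rather than through the ‘eval’ construction of the listing above.

```matlab
% Sketch with synthetic data (NOT the measured data of this example).
randn('state', 0);                            % legacy seeding, as used in Example 6.5
Nfft = 128; Navg = 10;                        % assumed record length and number of averages
Htrue = fft([1 0.5 0.25 zeros(1, Nfft-3)]);   % assumed 'true' FRF
Sxx = zeros(1, Nfft); Sxy = zeros(1, Nfft);
for n = 1:Navg
    x = zeros(1, Nfft);
    x(2:3) = (1 + 0.1*n)*[1 0.5];             % short, impact-like input (assumed shape)
    X = fft(x);
    Y = Htrue.*X + 0.05*fft(randn(1, Nfft));  % output plus noise uncorrelated with the input
    Sxx = Sxx + conj(X).*X;                   % running sum of X_n^* X_n
    Sxy = Sxy + conj(X).*Y;                   % running sum of X_n^* Y_n
    if n == 1
        H_single = Y./X;                      % Equation (6.55), one record
        H1_single = Sxy./Sxx;                 % Equation (6.56) with N = 1
    end
end
H1 = Sxy./Sxx;                                % Equation (6.56); the 1/N factors cancel
err_single = mean(abs(H_single - Htrue));     % mean error of the single-record estimate
err_avg = mean(abs(H1 - Htrue));              % mean error of the averaged estimate
fprintf('mean error: single %.4f, averaged %.4f\n', err_single, err_avg);
```

With one record the two equations give the same estimate, and with ten records the averaged estimate lies closer to the assumed FRF.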


Part II 

Introduction to Random Processes 


7 


Random Processes 


Introduction 

In Part I, we discussed Fourier methods for analysing deterministic signals. In Part II, 
our interest moves to the treatment of non-deterministic signals. There are many ways in 
which a signal may be characterized as non-deterministic. At this point we shall say that 
the time history of the signal cannot be predicted exactly. We may consider the signal 
shown in Figure 7.1 as a sample of a non-deterministic signal.



Figure 7.1 A sample of a non-deterministic signal 

An example of such a signal might be the time history measured from an accelerometer
mounted on a ventilation duct. In order to be able to describe the characteristics of
such a time history we need some basic ideas of probability and statistics. So we shall
now introduce relevant concepts and return to showing how we can use them for time
histories in Chapter 8.


7.1 BASIC PROBABILITY THEORY 

The mathematical theory of describing uncertain (or random) phenomena is probability theory.
It may be best explained by examples - games of chance such as tossing coins, rolling dice,
etc. First, we define a few terms:

(a) An experiment of chance is an experiment whose outcome is not predictable.

(b) The sample space is the collection (set) of all possible outcomes of an experiment, and
is denoted by Ω. For example, if an experiment is tossing a coin, then its sample space
is Ω = (H, T), where H and T denote head and tail respectively, and if an experiment is
rolling a die, then its sample space is Ω = (1, 2, ..., 6).

(c) An event is the outcome of an experiment and is the collection (subset) of points in the
sample space, and is denoted by E. For example, ‘the event that a number ≤ 4 occurs when
a die is rolled’ is indicated in the Venn diagram shown in Figure 7.2. Individual events in
the sample space are called elementary events, thus events are collections of elementary
events.

Figure 7.2 Sample space (Ω) and event (E)

The sample space Ω is the set of all possible outcomes, containing n_Ω elements. The
event E is a subset of Ω, containing n_E elementary events.

(d) Probability: To each event E in a sample space Ω, we may assign a number which
measures our belief that E will occur. This is the probability of occurrence of event
E, which is written as Prob[E] = P(E). In the case where each elementary event is
equally likely, then it is ‘logical’ that

$$P(E) = \frac{n_E}{n_\Omega} \tag{7.1}$$

This is a measure of the ‘likelihood of occurrence’ of an ‘event’ in an ‘experiment of
chance’, and the probability of event E in the above example is P(E) = n_E/n_Ω = 4/6 =
2/3. Note that P(E) is a ‘number’ such that

$$0 \le P(E) \le 1 \tag{7.2}$$

From this, we conclude that the probability of occurrence of a ‘certain’ event is one
and the probability of occurrence of an ‘impossible’ event is zero.

Algebra of Events 

Simple ‘set operations’ visualized with reference to Venn diagrams are useful in setting up
the basic axioms of probability. Given events A, B, C, ... in a sample space Ω, we can define
certain operations on them which lead to other events in Ω. These may be represented by Venn
diagrams. If event A is a subset of Ω (but not equal to Ω) we can draw and write A ⊂ Ω
as in Figure 7.3(a), and the complement of A is A′ (i.e. not A), denoted by the shaded area as
shown in Figure 7.3(b).

Figure 7.3 Event A as a subset of Ω and its complement

Figure 7.4 Union and intersection of two sets A and B (A ∪ B = C, where C is the shaded area)


The union (sum) of two sets A and B is the set of elements belonging to A or B or
both, and is denoted by A ∪ B. The intersection (or product) of two sets A and B is the set
of elements common to both A and B, denoted by A ∩ B. In Venn diagram terms, they are
shown as in Figure 7.4.

If two sets have no elements in common, we write A ∩ B = ∅ (the null set). Such events
are said to be mutually exclusive. For example, in rolling a die, if A is the event that a number
< 2 occurs, and B is the event that a number > 5 occurs, then A ∩ B = ∅, i.e. ∅ corresponds
to an impossible event.

Some properties of set operations are:

(a) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)    (7.3)

(b) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)    (7.4)

(c) (A ∪ B)′ = A′ ∩ B′    (7.5)

(d) For any set A, let n(A) denote the number of elements in A; then

    n(A ∪ B) = n(A) + n(B) − n(A ∩ B)    (7.6)

Two different cases are shown in Figure 7.5 to demonstrate the use of Equation (7.6).

Case (a): A and B are disjoint, i.e. n(A ∩ B) = 0. Thus, n(A ∪ B) = n(A) + n(B).

Case (b): The elements in A ∩ B are counted twice and so n(A ∩ B) must be subtracted.

Figure 7.5 Demonstration of n(A ∪ B) for two different cases

Algebra of Probabilities

The above intuitive ideas are formalized into the axioms of probability as follows. To
each event E_i (in a sample space Ω), we assign a number called the probability of E_i
(denoted P(E_i)) such that

(a) 0 ≤ P(E_i) ≤ 1    (7.7)

(b) If E_i and E_j are mutually exclusive, then

    P(E_i ∪ E_j) = P(E_i) + P(E_j)    (7.8)

(c) If ∪ E_i = Ω, then

    P(∪ E_i) = 1    (7.9)

(d) P(∅) = 0    (7.10)

(e) For any events E_1, E_2, not necessarily mutually exclusive,

    P(E_1 ∪ E_2) = P(E_1) + P(E_2) − P(E_1 ∩ E_2)    (7.11)
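
Equations (7.5), (7.6) and (7.11) can be checked on a small case using MATLAB's set functions. In this sketch the events A and B are arbitrary choices made for illustration (they are not taken from the text), and equally likely outcomes of a die are assumed.

```matlab
% Sketch: counting and probability rules for two overlapping events.
Omega = 1:6;                               % sample space for rolling a die
A = [1 2 3 4]; B = [3 4 5];                % assumed example events (they overlap)
n_union = numel(union(A, B));              % n(A U B)
n_inter = numel(intersect(A, B));          % n(A n B)
n_rhs = numel(A) + numel(B) - n_inter;     % right-hand side of Equation (7.6)
P = @(E) numel(E)/numel(Omega);            % probability by counting, Equation (7.1)
P_union = P(union(A, B));                  % left-hand side of Equation (7.11)
P_rhs = P(A) + P(B) - P(intersect(A, B));  % right-hand side of Equation (7.11)
comp = @(E) setdiff(Omega, E);             % complement with respect to Omega
dm_lhs = comp(union(A, B));                % (A U B)'
dm_rhs = intersect(comp(A), comp(B));      % A' n B', Equation (7.5)
fprintf('n(A U B) = %d, P(A U B) = %.4f\n', n_union, P_union);
```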

Equally Likely Events 


If n events, E_1, E_2, ..., E_n, are judged to be equally likely, then

$$P(E_i) = \frac{1}{n} \tag{7.12}$$

As an example of this, throw two dice and record the number on each face. What is
the probability of the event that the total score is 5? The answer is P(E_5) = n_{E_5}/n_Ω =
4/36 = 1/9.
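
The two-dice result can be confirmed by listing all 36 equally likely outcomes. A minimal sketch (variable names are illustrative):

```matlab
% Sketch: probability that the total score of two dice is 5, by enumeration.
[d1, d2] = meshgrid(1:6, 1:6);     % all 36 ordered outcomes of two dice
total = d1 + d2;                   % total score for each outcome
n_E5 = sum(total(:) == 5);         % outcomes with total 5: (1,4), (2,3), (3,2), (4,1)
n_Omega = numel(total);            % size of the sample space
P_E5 = n_E5/n_Omega;
fprintf('P(E5) = %d/%d = %.4f\n', n_E5, n_Omega, P_E5);
```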


Joint Probability 

The probability of occurrence of events A and B jointly is called a joint probability and is
denoted P(A ∩ B) or P(A, B). With reference to Figure 7.6, this is the occurrence of the
shaded area, i.e.

$$P(A \cap B) = \frac{n_{A \cap B}}{n_\Omega} \tag{7.13}$$

Figure 7.6 The intersection of two sets A and B in a sample space Ω

Conditional Probability

The probability of occurrence of an event A given that event B has occurred is written as
P(A|B), and is called a conditional probability. To explain this, consider the intersection of
two sets A and B in a sample space Ω as shown in Figure 7.6. To compute P(A|B), in effect
we are computing a probability with respect to a ‘reduced sample space’, i.e. it is the ratio of
the number of elements in the shaded area relative to the number of elements in B, namely
n_{A∩B}/n_B, which may be written (n_{A∩B}/n_Ω)/(n_B/n_Ω), or

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A, B)}{P(B)} \tag{7.14}$$
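
The ‘reduced sample space’ argument can be illustrated with a die. In this sketch the events (A: an even number, B: a number up to and including 4) are assumptions chosen for illustration; both routes to P(A|B) give the same value.

```matlab
% Sketch: conditional probability by counting and by Equation (7.14).
Omega = 1:6;                                   % sample space for rolling a die
A = Omega(mod(Omega, 2) == 0);                 % assumed event A: even number (2, 4, 6)
B = Omega(Omega <= 4);                         % assumed event B: number <= 4
AB = intersect(A, B);                          % A n B = (2, 4)
P_count = numel(AB)/numel(B);                  % ratio within the reduced sample space B
P_AB = numel(AB)/numel(Omega);                 % joint probability, Equation (7.13)
P_B = numel(B)/numel(Omega);
P_ratio = P_AB/P_B;                            % Equation (7.14)
fprintf('P(A|B) = %.4f (counting), %.4f (ratio)\n', P_count, P_ratio);
```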


Statistical Independence 

If P(A|B) = P(A), we say events A and B are statistically independent. Note that this is so
if P(A ∩ B) = P(A)P(B). As an example of this, toss a coin and roll a die. The probability
that a coin lands head and a die scores 3 is P(A ∩ B) = 1/2 · 1/6 = 1/12 since the events are
independent, i.e. knowing the result of the first event (a coin lands head or tail) does not give
us any information on the second event (score on the die).
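
The coin-and-die example can be verified by enumerating the 12 equally likely joint outcomes. A minimal sketch (coding a head as 1 and a tail as 0 is an arbitrary choice made here):

```matlab
% Sketch: statistical independence of a coin toss and a die roll.
[c, d] = meshgrid(0:1, 1:6);              % c: coin (1 = head, 0 = tail), d: die score
n_Omega = numel(c);                       % 12 joint outcomes
P_A = sum(c(:) == 1)/n_Omega;             % coin lands head
P_B = sum(d(:) == 3)/n_Omega;             % die scores 3
P_AB = sum(c(:) == 1 & d(:) == 3)/n_Omega;% both occur jointly
fprintf('P(A)P(B) = %.4f, P(A n B) = %.4f\n', P_A*P_B, P_AB);
```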


Relative Frequencies

As defined in Equations (7.1) and (7.2), the probability of event E in a sample space
Ω, P(E), is a theoretical concept which can be computed without conducting an
experiment. In the simple example above this has worked based on the assumption of equal
likelihood of occurrence of the elementary events. When this is not the case we resort
to measurements to ‘estimate’ the probability of occurrence of events. We approach this
via the notion of relative frequency (or proportion) of times that E occurs in a long series
of trials. Thus, if event E occurs n_E times in N trials, then the relative frequency of E is
given by

$$f_E = \frac{n_E}{N} \tag{7.15}$$

Obviously, as N changes, so does f_E. For example, toss a coin and note f_H (the relative
frequency of a head occurring) as N increases. This is shown in Figure 7.7.

Figure 7.7 Illustration of f_H as N increases

This graph suggests that the error is ‘probably’ reduced as N gets larger.
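
A coin-tossing experiment of this kind is easy to simulate. The sketch below is illustrative only: it assumes a fair coin, uses 600 simulated tosses, and the result changes from run to run because the random number generator is not seeded.

```matlab
% Sketch: relative frequency of a head, f_H, as the number of tosses N grows.
N = 600;                           % assumed number of tosses
tosses = rand(1, N) < 0.5;         % 1 = head, 0 = tail (fair coin assumed)
n_H = cumsum(tosses);              % number of heads after each toss
f_H = n_H./(1:N);                  % Equation (7.15) evaluated for N = 1, 2, ..., 600
fprintf('f_H after %d tosses: %.3f\n', N, f_H(end));
```

Typing ‘plot(1:N, f_H)’ afterwards shows f_H fluctuating strongly for small N and settling near 0.5 as N increases.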




The above notion of relative frequency is not useful as a definition of probability because
its values are not unique, but it is intuitively appealing and is used to estimate probabilities
where applicable, i.e. f_E is often taken as an estimate of P(E). The relative frequency is
sometimes referred to as the ‘empirical’ probability since it is deduced from observed data.
This estimate has the following properties:

(a) For all events A,

    f_A ≥ 0 (non-negativity)    (7.16)

(b) For all mutually exclusive events A and B,

    f_{A∪B} = (n_A + n_B)/N = f_A + f_B (additivity)    (7.17)

(c) For any set of collectively exhaustive events, A_1, A_2, ..., i.e. A_1 ∪ A_2 ∪ ··· = ∪ A_i,

    f_{∪A_i} = N/N = 1 (certainty)    (7.18)

i.e. a ‘certain’ event has a relative frequency of ‘1’.


7.2 RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS 

In many cases it is more convenient to define the outcome of an experiment as a set of
numbers rather than the actual elements of the sample space. So we define a random variable
as a function defined on a sample space. For example, if Ω = (H, T) for a coin tossing
experiment, we may choose to say we get a number ‘1’ when a head occurs and ‘0’ when the
tail occurs, i.e. we ‘map’ from the sample space to a ‘range space’ or a new sample space as
shown in Figure 7.8. We may write the function such that X(H) = 1 and X(T) = 0. More
generally, for any element ω_i in Ω, we define a function X(ω_i).

Note that the number of elements of Ω and the number of values taken by X(ω_i) need not
be the same. For an example, toss two coins and record the outcomes and define the random
variable X as the number of heads occurring. This is shown in Figure 7.9.
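
The two-coin mapping can be written out explicitly. This is a small sketch only; representing the outcomes as the strings 'HH', 'HT', 'TH' and 'TT' is a choice made here for illustration.

```matlab
% Sketch: the random variable X = number of heads when two coins are tossed.
outcomes = {'HH', 'HT', 'TH', 'TT'};                  % elements of the sample space
X = cellfun(@(s) sum(s == 'H'), outcomes);            % value of X for each outcome
values = unique(X);                                   % values in the range space
P = arrayfun(@(v) sum(X == v)/numel(X), values);      % probability of each value
disp([values; P])                                     % four outcomes map to three values
```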


Figure 7.8 A random variable X that maps from a sample space Ω to a range space Ω_X

Figure 7.9 An example of a random variable X (the outcomes (H,H), (H,T), (T,H), (T,T) are
mapped by X(ω_i) to the values 0, 1, 2)

We note that the values taken by a random variable are denoted x_i, i.e. X(ω_i) = x_i, and
the notation X(ω_i) is often abbreviated to X. In many cases the sample space and the range
space ‘fuse’ together, e.g. when the outcome is already a number (rolling a die, recording a
voltage, etc.).

There are two types of random variable. If the sample space Ω_X consists of discrete
elements, i.e. countable, X is said to be a discrete random variable, e.g. rolling a die. If Ω_X
consists of ‘continuous’ values, i.e. uncountable (or non-denumerable), then X is a continuous
random variable, e.g. the voltage fluctuation on an ‘analogue’ meter. Some processes may be
mixed, e.g. a binary signal in noise.


Probability Distributions for Discrete Random Variables 

For a discrete random variable X (which takes on only a discrete set of values x_1, x_2, ...),
the probability distribution of X is characterized by specifying the probabilities that the
random variable X is equal to x_i, for every x_i, i.e.

    P[X = x_i] for x_i = x_1, x_2, ...    (7.19)

where P[X = x_i] describes the probability distribution of a discrete random variable X
and satisfies Σ_i P[X = x_i] = 1, e.g. for rolling a die, the probability distribution is as
shown in Figure 7.10.

Figure 7.10 Probability distribution for rolling a die (P[X = x_i] = 1/6 for x_i = 1, 2, ..., 6)


The Cumulative Distribution 

Random variables have a (cumulative) distribution function (cdf). This is the probability
of a random variable X taking a value less than or equal to x. This is described by F(x)
where

    F(x) = P[X ≤ x] = Prob[X taking on a value up to and including x]    (7.20)

For a discrete random variable there are jumps in the function F(x) as shown in
Figure 7.11 (for rolling a die).

Figure 7.11 Cumulative distribution function for rolling a die

Since probabilities are non-negative the cumulative distribution function is monotonic
non-decreasing.
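
For the die, the distribution function is simply the running sum of the probabilities of Equation (7.19). A minimal sketch (variable names are illustrative):

```matlab
% Sketch: cumulative distribution function for rolling a die.
x = 1:6;                       % values taken by X
pmf = ones(1, 6)/6;            % P[X = x_i] = 1/6 for each face
F = cumsum(pmf);               % F(x_i) = P[X <= x_i], Equation (7.20)
disp([x; F])                   % steps of 1/6, reaching 1 at x = 6
```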


Continuous distributions 

For a continuous process, the sample space is infinite and non-denumerable. So the
probability that X takes the value x is zero, i.e. P[X = x] = 0. Whilst technically correct
this is not particularly useful, since X will take specific values. So a more useful approach
is to think of the probability of X lying within intervals on the x-axis, i.e. P[X > a],
P[a < X ≤ b], etc.

We start by considering the distribution function F(x) = P[X ≤ x]. F(x) must have
a general shape such as the graph shown in Figure 7.12.

Figure 7.12 An example of a distribution function for a continuous process

From Figure 7.12, some properties of F(x) are:

(a) F(−∞) = 0, F(∞) = 1    (7.21)

(b) F(x_2) ≥ F(x_1) for x_2 ≥ x_1    (7.22)

(c) P[a < X ≤ b] = P[X ≤ b] − P[X ≤ a] = F(b) − F(a) for a < b    (7.23)


Probability Density Functions 

Using the properties of the distribution function $F(x)$, the probability of X lying in an 
interval $x$ to $x + \delta x$ can be written as 

$$P[x < X \le x + \delta x] = F(x + \delta x) - F(x) \tag{7.24}$$

which shrinks to zero as $\delta x \to 0$. However, consider $P[x < X \le x + \delta x]/\delta x$. This is the 
probability of lying in a band (width $\delta x$) divided by that bandwidth. Then, if the quantity 
$\lim_{\delta x \to 0} P[x < X \le x + \delta x]/\delta x$ exists, it is called the probability density function (pdf), 
which is denoted $p(x)$ and is (from Equation (7.24)) 

$$p(x) = \lim_{\delta x \to 0} \frac{P[x < X \le x + \delta x]}{\delta x} = \frac{dF(x)}{dx} \tag{7.25}$$

From Equation (7.25) it follows that 

$$F(x) = \int_{-\infty}^{x} p(u)\,du \tag{7.26}$$

Some properties of the probability density function $p(x)$ are: 

(a) $p(x) \ge 0$ (7.27) 
i.e. the probability density function is non-negative; 

(b) $\int_{-\infty}^{\infty} p(x)\,dx = 1$ (7.28) 
i.e. the area under the probability density function is unity; 

(c) $P[a < X \le b] = \int_{a}^{b} p(x)\,dx$ (7.29) 

As an example of Equation (7.29), P[a < X < b] can be found by evaluating the 
shaded area shown in Figure 7.13. 



Figure 7.13 An example of a probability density function 
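As a rough numerical check of Equations (7.23), (7.28) and (7.29), the sketch below uses an illustrative exponential density p(x) = exp(-x) for x >= 0. This density, the limits a and b, and the grid sizes are assumptions made for the illustration; they are not taken from the text.

```matlab
% Illustrative pdf and its distribution function (assumed example)
p = @(x) exp(-x);            % p(x) for x >= 0
F = @(x) 1 - exp(-x);        % F(x) obtained by integrating p(u) from 0 to x

a = 0.5; b = 2;              % arbitrary interval limits
x = linspace(a, b, 10001);   % integration grid between a and b
P_area = trapz(x, p(x));     % area under p(x) from a to b, Equation (7.29)
P_cdf  = F(b) - F(a);        % same probability from Equation (7.23)

xx = linspace(0, 40, 400001);    % wide grid; the tail beyond 40 is negligible
total_area = trapz(xx, p(xx));   % should be close to unity, Equation (7.28)

fprintf('P from area = %.6f, P from F = %.6f, total area = %.6f\n', ...
        P_area, P_cdf, total_area);
```

The two probabilities agree to within the accuracy of the trapezoidal rule, which is the sense in which the shaded area in Figure 7.13 represents a probability.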


Note that we can also define the probability density function for a discrete random variable 
if the properties of delta functions are used. For example, the probability density function for 
rolling a die is shown in Figure 7.14. 

$$p(x) = \frac{dF(x)}{dx} = \sum_{i=1}^{6} \frac{1}{6}\,\delta(x - x_i), \quad x_i = 1, 2, \ldots, 6$$

Figure 7.14 Probability density function for rolling a die 


Joint Distributions 

The above descriptions involve only a single random variable X. This is a univariate process. 
Now, consider a process which involves two random variables (say X and Y), i.e. a bivariate 
process. The probability that $X \le x$ occurs jointly with $Y \le y$ is 

$$P[X \le x \cap Y \le y] = F(x, y) \tag{7.30}$$

Note that $F(-\infty, y) = F(x, -\infty) = 0$, $F(\infty, \infty) = 1$, $F(x, \infty) = F(x)$ and $F(\infty, y) = F(y)$. 
Similar to the univariate case, the ‘joint probability density function’ is defined as 

$$p(x, y) = \lim_{\delta x,\,\delta y \to 0} \frac{P[x < X \le x + \delta x \cap y < Y \le y + \delta y]}{\delta x\,\delta y} = \frac{\partial^2 F(x, y)}{\partial x\,\partial y} \tag{7.31}$$

and 

$$F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} p(u, v)\,dv\,du \tag{7.32}$$

Note that 

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p(x, y)\,dy\,dx = 1 \quad \text{and} \quad \int_{-\infty}^{x}\int_{-\infty}^{\infty} p(u, v)\,dv\,du = F(x)$$

hence 

$$p(x) = \int_{-\infty}^{\infty} p(x, y)\,dy \tag{7.33}$$

This is called a ‘marginal’ probability density function. 
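The marginal density of Equation (7.33) can be sketched numerically. The joint density p(x, y) = x + y on the unit square used below is an assumed illustration (it is non-negative and integrates to one); it does not come from the text. Its marginal is p(x) = x + 1/2.

```matlab
% Marginal pdf by integrating an assumed joint pdf over y
x = linspace(0, 1, 201);
y = linspace(0, 1, 201);
[Xg, Yg] = meshgrid(x, y);     % rows index y, columns index x
pxy = Xg + Yg;                 % illustrative joint pdf on the unit square

px = trapz(y, pxy, 1);         % integrate along dimension 1 (over y): marginal p(x)
total = trapz(x, px);          % integrate the marginal over x: should be 1

fprintf('p(x = 0.3) = %.4f (expected 0.8), total = %.4f\n', ...
        interp1(x, px, 0.3), total);
```

Because the integrand is linear in y, the trapezoidal rule reproduces the marginal x + 1/2 exactly apart from rounding.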

These ideas may be extended to n random variables, $X_1, X_2, \ldots, X_n$, i.e. we may define 
$p(x_1, x_2, \ldots, x_n)$. We shall only consider univariate and bivariate processes in this book. 


7.3 EXPECTATIONS OF FUNCTIONS OF A RANDOM VARIABLE 

So far, we have used probability distributions to describe the properties of random vari- 
ables. However, rather than using probability distributions, we often use averages. This 
introduces the concept of the expectation of a process. 

Consider a discrete random variable X which can assume any values $x_1, x_2, \ldots$ with 
probabilities $p_1, p_2, \ldots$. If $x_i$ occurs $n_i$ times in N trials of an experiment, then the 
average value of X is 

$$\bar{x} = \frac{1}{N}\sum_{i} n_i x_i \tag{7.34}$$

where $\bar{x}$ is called the sample mean. Since $n_i/N = f_i$ (the empirical probability of 
occurrence of $x_i$), Equation (7.34) can be written as $\bar{x} = \sum_i x_i f_i$. As $N \to \infty$, the 
empirical probability approaches the theoretical probability. So the expression for $\bar{x}$ becomes 
$\sum_i x_i p_i$ and this defines the theoretical mean value of X. 

For a continuous process, the probability $p_i$ may be replaced by the probability 
density multiplied by the bandwidth, i.e. $p_i \to p(x_i)\,\delta x_i$. So $\sum_i x_i p_i$ becomes $\sum_i x_i p(x_i)\,\delta x_i$, 
which as $\delta x_i \to 0$ is written $\int_{-\infty}^{\infty} x\,p(x)\,dx$. This defines the theoretical mean value of X 
which we write as E[X], the expected value of X, i.e. 

$$E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx \tag{7.35}$$


This is the ‘mean value’ or the ‘first moment’ of a random variable X. More generally, the 
expectation operation generalizes to functions of a random variable. For example, if Y = g(X), 
i.e. as shown in Figure 7.15, 

Figure 7.15 System with random input and random output (input X, system g( ), output Y = g(X)) 

then the expected (or average) value of Y is 

$$E[Y] = E[g(X)] = \int_{-\infty}^{\infty} g(x)\,p(x)\,dx \tag{7.36}$$

For a discrete process, this becomes 

$$E[g(X)] = \sum_{i} g(x_i)\,p_i \tag{7.37}$$

This may be extended to functions of several random variables. For example, in a bivariate 
process with random variables X and Y, if W = g(X, Y), then the expected value of W is 

$$E[W] = E[g(X, Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)\,p(x, y)\,dx\,dy \tag{7.38}$$
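For a discrete process the expectation is a weighted sum, as in Equation (7.37). A minimal sketch for a fair die follows; the choice g(X) = X^2 is an assumption made only for illustration.

```matlab
% Expected values for a fair die using E[g(X)] = sum_i g(x_i) p_i
x = 1:6;                 % possible outcomes x_i
p = ones(1, 6)/6;        % probabilities p_i (fair die assumed)
g = @(x) x.^2;           % illustrative function g(X) = X^2

EX  = sum(x.*p);         % mean value E[X]
EgX = sum(g(x).*p);      % E[g(X)] = E[X^2], the mean square value

fprintf('E[X] = %.4f, E[X^2] = %.4f\n', EX, EgX);
```

This gives E[X] = 3.5 and E[X^2] = 91/6, about 15.1667. Note that E[X^2] differs from (E[X])^2 = 12.25; the difference is the variance of the die outcome.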


Moments of a Random Variable 

The probability density function p(x) contains the complete information about the prob- 
ability characteristics of X, but it is sometimes useful to summarize this information in 
a few numerical parameters - the so-called moments of a random variable. The first and 
second moments are given below: 

(a) First moment (mean value): 

$$\mu_x = E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx \tag{7.39}$$

(b) Second moment (mean square value): 

$$E[X^2] = \int_{-\infty}^{\infty} x^2 p(x)\,dx \tag{7.40}$$

Note that, instead of using Equation (7.40), the ‘central moments’ (moments about 
the mean) are usually used. The second moment about the mean is called the variance, 
which is written as 

$$\mathrm{Var}(X) = \sigma_x^2 = E[(X - \mu_x)^2] = \int_{-\infty}^{\infty} (x - \mu_x)^2 p(x)\,dx \tag{7.41}$$

where $\sigma_x = \sqrt{\mathrm{Var}(X)}$ is called the standard deviation, and is the root mean square (rms) 
of a ‘zero’ mean variable. 


In many cases, the above two moments $\mu_x$ and $\sigma_x^2$ are the most important measures of 
a random variable X. However, the third and fourth moments are useful in considerations of 
processes that are non-Gaussian (discussed later in this chapter). 

The first moment $\mu_x$ is a measure of ‘location’ of $p(x)$ on the x-axis; the variance $\sigma_x^2$ 
is a measure of dispersion or spread of $p(x)$ relative to $\mu_x$. The following few examples 
illustrate this. 


Some ‘Well-known’ Distributions 


A Uniform Distribution (Figure 7.16) 

This is often used to model the errors involved in measurement (see quantization noise 
discussed in Chapter 5). 

$$p(x) = \frac{1}{b - a} \ \text{for } a \le x \le b, \qquad p(x) = 0 \ \text{otherwise}$$

Mean value: $\mu_x = \dfrac{a + b}{2}$ 

Variance: $\sigma_x^2 = \dfrac{(b - a)^2}{12}$ 

Figure 7.16 Probability density function of a uniform distribution 
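The uniform model of measurement error can be illustrated with a rounding (quantization) error. The sketch below is only a simulation under assumed conditions: a unit quantization step and a test signal that is spread over many steps, so that the error is roughly uniform on -0.5 to 0.5. The sample size and signal range are arbitrary.

```matlab
% Rounding error modelled as a uniform distribution on [a, b] = [-0.5, 0.5]
N = 100000;                 % number of samples (arbitrary)
x = 100*rand(1, N);         % test signal covering many quantization steps
e = x - round(x);           % rounding error for a unit step size

a = -0.5; b = 0.5;
m_theory = (a + b)/2;       % mean value of the uniform distribution
v_theory = (b - a)^2/12;    % variance of the uniform distribution

m_est = mean(e);            % sample mean of the error
v_est = var(e, 1);          % sample variance (divisor N)

fprintf('mean: %.4f (theory %.4f), variance: %.5f (theory %.5f)\n', ...
        m_est, m_theory, v_est, v_theory);
```

The estimates vary slightly from run to run, but they should lie close to 0 and 1/12.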


Rayleigh Distribution (Figure 7.17) 

This is used in fatigue analysis, e.g. to model cyclic stresses. 

$$p(x) = \frac{x}{c^2}\,e^{-x^2/2c^2} \ \text{for } x \ge 0, \qquad p(x) = 0 \ \text{otherwise}$$

Mean value: $\mu_x = c\sqrt{\pi/2}$ 

Variance: $\sigma_x^2 = \dfrac{4 - \pi}{2}\,c^2$ 

Figure 7.17 Probability density function of a Rayleigh distribution 

Gaussian Distribution (Normal Distribution) 

This is probably the most important distribution, since many practical processes can 
be approximated as Gaussian (see a statement of the central limit theorem below). If a 
random variable X is normally distributed, then its probability distribution is completely 
described by two parameters, its mean value $\mu_x$ and variance $\sigma_x^2$ (or standard deviation 
$\sigma_x$), and the probability density function of a Gaussian distribution is given by 

$$p(x) = \frac{1}{\sigma_x\sqrt{2\pi}}\,e^{-(x - \mu_x)^2/2\sigma_x^2} \tag{7.42}$$

If $\mu_x = 0$ and $\sigma_x^2 = 1$, then it is called the ‘standard normal distribution’. For $\mu_x = 0$, 
some examples of the Gaussian distribution are shown in Figure 7.18. 

Figure 7.18 Probability density functions of Gaussian distribution 
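The mean and variance quoted for the Rayleigh distribution, and the unit area and variance of the Gaussian density in Equation (7.42), can be checked by numerical integration. The parameter values (c = 2, mean 1, standard deviation 0.5) and the grids below are arbitrary choices for this sketch.

```matlab
% Rayleigh distribution: numerical first moment and second central moment
c = 2;                                     % Rayleigh parameter (illustrative)
x = linspace(0, 12*c, 200001);             % grid; the tail beyond 12c is negligible
p_ray = (x./c^2).*exp(-x.^2/(2*c^2));      % Rayleigh pdf for x >= 0
mu_ray  = trapz(x, x.*p_ray);              % compare with c*sqrt(pi/2)
var_ray = trapz(x, (x - mu_ray).^2.*p_ray);% compare with (4 - pi)/2*c^2

% Gaussian distribution, Equation (7.42)
mu = 1; sigma = 0.5;                       % illustrative mean and standard deviation
u = linspace(mu - 10*sigma, mu + 10*sigma, 200001);
p_gau = exp(-(u - mu).^2/(2*sigma^2))/(sigma*sqrt(2*pi));
area_gau = trapz(u, p_gau);                % should be close to 1
mu_gau   = trapz(u, u.*p_gau);             % should be close to mu
var_gau  = trapz(u, (u - mu_gau).^2.*p_gau);   % should be close to sigma^2

fprintf('Rayleigh: mean %.4f (%.4f), variance %.4f (%.4f)\n', ...
        mu_ray, c*sqrt(pi/2), var_ray, (4 - pi)/2*c^2);
fprintf('Gaussian: area %.4f, mean %.4f, variance %.4f\n', ...
        area_gau, mu_gau, var_gau);
```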

The importance of the Gaussian distribution is illustrated by a particular property: let 
$X_1, X_2, \ldots$ be independent random variables that have their own probability distributions; 
then the sum of random variables, $S_n = \sum_{k=1}^{n} X_k$, tends to have a Gaussian distribution 
as n gets large, regardless of the individual distributions of $X_k$. This is a version of 
the so-called central limit theorem. Moreover, it is interesting to observe the speed with 
which this occurs as n increases (see MATLAB Example 7.2). 


For a Gaussian bivariate process (random variables X and Y), the joint probability density 
function is written as 

$$p(x, y) = \frac{1}{2\pi\,|S|^{1/2}} \exp\left[-\frac{1}{2}(v - m)^T S^{-1} (v - m)\right] \tag{7.43}$$

where 

$$S = \begin{bmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{bmatrix}, \quad v = \begin{bmatrix} x \\ y \end{bmatrix} \quad \text{and} \quad m = \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}$$

Also $\mu_x = E[X]$, $\mu_y = E[Y]$, $\sigma_x^2 = E[(X - \mu_x)^2]$, $\sigma_y^2 = E[(Y - \mu_y)^2]$ and 
$\sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)]$ (this is discussed shortly). 

Bivariate Processes (see MATLAB Example 7.3) 

The concept of moments generalizes to bivariate processes, essentially based on Equation 
(7.38). For example, the expected value of the product of two variables X and Y is 

$$E[XY] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x y\,p(x, y)\,dx\,dy \tag{7.44}$$

This is a generalization of the second moment (see Equation (7.40)). If we centralize the 
process (i.e. subtract the mean from each) then 

$$\mathrm{Cov}(X, Y) = \sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \mu_x)(y - \mu_y)\,p(x, y)\,dx\,dy \tag{7.45}$$

E[XY] is called the correlation between X and Y, and Cov(X, Y) is called the covariance 
between X and Y. They are related by 

$$\mathrm{Cov}(X, Y) = E[XY] - \mu_x\mu_y = E[XY] - E[X]E[Y] \tag{7.46}$$

Note that the covariance and correlation are the same if $\mu_x = \mu_y = 0$. Some definitions 
for jointly distributed random variables are given below. 

X and Y are: 

(a) uncorrelated if E[XY] = E[X]E[Y] (or Cov(X, Y) = 0) 
(note that, for zero-mean variables, if X and Y are uncorrelated, then E[XY] = 0); 

(b) orthogonal if E[XY] = 0; 

(c) independent (statistically) if p(x, y) = p(x)p(y). 


Note that, if X and Y are independent they are uncorrelated. However, uncorrelated 
random variables are not necessarily independent. For example, let X be a random variable 
uniformly distributed over the range -1 to 1. Note that the mean value E[X] = 0. Let another 
random variable be $Y = X^2$. Then obviously $p(x, y) \ne p(x)p(y)$, i.e. X and Y are dependent (if 
X is known, Y is also known). But $\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] = E[X^3] = 0$ shows 
that they are uncorrelated (and also orthogonal). Note that they are related nonlinearly. 
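The example of X uniformly distributed on -1 to 1 with Y equal to the square of X can be simulated as a quick check. The sample size is arbitrary and the estimates are only approximately zero because of the finite sample.

```matlab
% Uncorrelated but dependent variables: X uniform on (-1, 1), Y = X^2
N = 100000;                          % number of samples (arbitrary)
X = 2*rand(1, N) - 1;                % uniform on (-1, 1), so E[X] = 0
Y = X.^2;                            % completely determined by X (nonlinear relation)

cov_xy = mean(X.*Y) - mean(X)*mean(Y);   % sample covariance (divisor N)
r_xy   = cov_xy/(std(X, 1)*std(Y, 1));   % sample correlation coefficient

fprintf('Cov(X,Y) = %.4f, correlation coefficient = %.4f\n', cov_xy, r_xy);
```

Both values come out close to zero even though Y is known exactly once X is known, which is the point of the example: zero covariance rules out a linear relationship only.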

An important measure called the correlation coefficient is defined as 

$$\rho_{xy} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x\sigma_y} = \frac{E[(X - \mu_x)(Y - \mu_y)]}{\sigma_x\sigma_y} \tag{7.47}$$

This is a measure (or degree) of a linear relationship between two random variables, and the 
correlation coefficient has values in the range $-1 \le \rho_{xy} \le 1$. If $|\rho_{xy}| = 1$, then two random 
variables X and Y are ‘fully’ related in a linear manner, e.g. Y = aX + b, where a and b are 
constants. If $\rho_{xy} = 0$, there is no linear relationship between X and Y. Note that the correlation 
coefficient detects only linear relationships between X and Y. Thus, even if $\rho_{xy} = 0$, X and 
Y can be related in a nonlinear fashion (see the above example, i.e. X and $Y = X^2$, where X 
is uniformly distributed on -1 to 1). 

Some Important Properties of Moments 

(a) $E[aX + b] = aE[X] + b$ (a, b are some constants) (7.48) 

(b) $E[aX + bY] = aE[X] + bE[Y]$ (7.49) 

(c) $\mathrm{Var}(X) = E[X^2] - \mu_x^2 = E[X^2] - E^2[X]$ (7.50) 

Proof: $\mathrm{Var}(X) = E[(X - \mu_x)^2] = E[X^2 - 2\mu_x X + \mu_x^2]$ 
$= E[X^2] - 2\mu_x E[X] + \mu_x^2 = E[X^2] - \mu_x^2$ 

(d) $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$ (7.51) 

(e) $\mathrm{Cov}(X, Y) = E[XY] - \mu_x\mu_y = E[XY] - E[X]E[Y]$ (7.52) 

(f) $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$ (7.53) 

Proof: $\mathrm{Var}(X + Y) = E[(X + Y)^2] - E^2[(X + Y)]$ 
$= E[X^2 + 2XY + Y^2] - E^2[X] - 2E[X]E[Y] - E^2[Y]$ 
$= (E[X^2] - E^2[X]) + (E[Y^2] - E^2[Y]) + 2(E[XY] - E[X]E[Y])$ 
$= \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$ 

Note that, if X and Y are independent or uncorrelated, Var(X + Y) = Var(X) + Var(Y). 
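Property (f), Equation (7.53), also holds exactly for sample moments provided the same divisor is used throughout. The small data set below is made up for illustration; the divisor N is an arbitrary but consistent choice.

```matlab
% Check Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) on a small made-up data set
x = [1 2 3 4 5];
y = [2 1 4 3 6];
N = numel(x);

var_x  = var(x, 1);                               % divisor N
var_y  = var(y, 1);                               % divisor N
cov_xy = sum((x - mean(x)).*(y - mean(y)))/N;     % divisor N
var_sum = var(x + y, 1);                          % left-hand side of (7.53)
rhs     = var_x + var_y + 2*cov_xy;               % right-hand side of (7.53)

fprintf('Var(X+Y) = %.4f, Var(X)+Var(Y)+2Cov(X,Y) = %.4f\n', var_sum, rhs);
```

For these data Var(X) = 2, Var(Y) = 2.96 and Cov(X, Y) = 2, so both sides equal 8.96. Dropping the covariance term would give 4.96, which shows why the simplified form applies only to uncorrelated variables.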
Higher Moments 

We have seen that the first and second moments are sufficient to describe the probability 
distribution of a Gaussian process. For a non-Gaussian process, some useful information 
about the probability density function of the process can be obtained by considering higher 
moments of the random variable. 

The generalized kth moment is defined as 

$$M'_k = E[X^k] = \int_{-\infty}^{\infty} x^k p(x)\,dx \tag{7.54}$$

The kth moment about the mean (central moment) is defined as 

$$M_k = E[(X - \mu_x)^k] = \int_{-\infty}^{\infty} (x - \mu_x)^k p(x)\,dx \tag{7.55}$$


In engineering, the third and fourth moments are widely used. For example, the third 
moment about the mean, $E[(X - \mu_x)^3]$, is a measure of asymmetry of a probability 
distribution, so it is called the skewness. In practice, the coefficient of skewness is more 
often used, and is defined as 

$$\gamma_1 = \frac{E[(X - \mu_x)^3]}{\sigma_x^3} \tag{7.56}$$

Note that, in many texts, Equation (7.56) is simply referred to as skewness. Also note that 
$\gamma_1 = 0$ for a Gaussian distribution since it has a symmetric probability density function. 
Typical skewed probability density functions are shown in Figure 7.19. Such asymmetry 
could arise from signal ‘clipping’. 


Figure 7.19 Skewed probability density functions: (a) Negative skewness; (b) Positive skewness 


The fourth moment about the mean, $E[(X - \mu_x)^4]$, measures the degree of flattening 
of a probability density function near its mean. Similar to the skewness, the coefficient 
of kurtosis (or simply the kurtosis) is defined as 

$$\gamma_2 = \frac{E[(X - \mu_x)^4]}{\sigma_x^4} - 3 \tag{7.57}$$

where ‘-3’ is introduced to make $\gamma_2 = 0$ for a Gaussian distribution (i.e. 
$E[(X - \mu_x)^4]/\sigma_x^4 = 3$ for a Gaussian distribution, thus $E[(X - \mu_x)^4]/\sigma_x^4$ is often used 
and examined with respect to the value 3). 

A distribution with positive kurtosis ($\gamma_2 > 0$) is called leptokurtic (more peaky than 
Gaussian), and a distribution with negative kurtosis ($\gamma_2 < 0$) is called platykurtic (more 
flattened than Gaussian). This is illustrated in Figure 7.20. 



Figure 7.20 Probability density functions with different values of kurtosis 


Since $\gamma_1 = 0$ and $\gamma_2 = 0$ for a Gaussian process, the third and fourth moments (or 
$\gamma_1$ and $\gamma_2$) can be used for detecting non-Gaussianity. These higher moments may also be 
used to detect (or characterize) nonlinearity since nonlinear systems exhibit non-Gaussian 
responses. 

The kurtosis (fourth moment) is widely used as a measure in machinery condition 
monitoring - for example, early damage in rolling elements of machinery often results in vibration 
signals whose kurtosis value is significantly increased owing to the impacts occurring because 
of the faults in such rotating systems. 

As a further example, consider a large machine (in good condition) that has many compo- 
nents generating different types of (periodic and random) vibration. In this case, the vibration 
signal measured on the surface of the machine may have a probability distribution similar to 
a Gaussian distribution (by the central limit theorem). Later in the machine’s operational life, 
one of the components may produce a repetitive transient signal (possibly due to a bearing 
fault). This impact produces wide excursions and more oscillatory behaviour and changes 
the probability distribution from Gaussian to one that is leptokurtic (see Figure 7.20). The 
detection of the non-Gaussianity can be achieved by monitoring the kurtosis (see MATLAB 
Example 7.4). Note that, if there is severe damage, i.e. many components are faulty, then the 
measured signal may become Gaussian again. 


Computational Considerations of Moments (Digital Form) 

We now indicate some ways in which the moments described above might be estimated from 
measured data. No attempt is made at this stage to give measures of the accuracy of these 
estimates. This will be discussed later in Chapter 10. 

Suppose we have a set of data $(x_1, x_2, \ldots, x_N)$ collected from N measurements of a 
random variable X. Then the sample mean $\bar{x}$ (which estimates the arithmetic mean, $\mu_x$) is 
computed as 

$$\bar{x} = \frac{1}{N}\sum_{n=1}^{N} x_n \tag{7.58}$$

For the estimation of the variance $\sigma_x^2$, one may use the formula 

$$s_x^2 = \frac{1}{N}\sum_{n=1}^{N} (x_n - \bar{x})^2 \tag{7.59}$$

However, this estimator usually underestimates the variance, so it is a biased estimator. Note 
that $\bar{x}$ is present in the formula, thus the divisor N - 1 is more frequently used. This gives an 
unbiased sample variance, i.e. 

$$s_x^2 = \frac{1}{N - 1}\sum_{n=1}^{N} (x_n - \bar{x})^2 \tag{7.60}$$

where $s_x$ is the sample standard deviation. Since 

$$\sum_{n=1}^{N} (x_n - \bar{x})^2 = \sum_{n=1}^{N} x_n^2 - 2N\bar{x}^2 + N\bar{x}^2$$

the following computationally more efficient form is often used: 

$$s_x^2 = \frac{1}{N - 1}\left[\left(\sum_{n=1}^{N} x_n^2\right) - N\bar{x}^2\right] \tag{7.61}$$

The above estimation can be generalized, i.e. the kth sample (raw) moment is defined as 

$$m'_k = \frac{1}{N}\sum_{n=1}^{N} x_n^k \tag{7.62}$$

Note that $m'_1 = \bar{x}$ and $m'_2$ is the mean square value of the sample. Similarly the kth sample 
central moment is defined as 

$$m_k = \frac{1}{N}\sum_{n=1}^{N} (x_n - \bar{x})^k \tag{7.63}$$

Note that $m_1 = 0$ and $m_2$ is the (biased) sample variance. As in the above equation, the divisor 
N is usually used for the sample moments. For the estimation of skewness and kurtosis 
coefficients, the following biased estimators are often used: 

$$\mathrm{Skew} = \frac{\frac{1}{N}\sum_{n=1}^{N} (x_n - \bar{x})^3}{s_x^3} \tag{7.64}$$

$$\mathrm{Kurt} = \frac{\frac{1}{N}\sum_{n=1}^{N} (x_n - \bar{x})^4}{s_x^4} - 3 \tag{7.65}$$

where the sample standard deviation 

$$s_x = \sqrt{\frac{1}{N}\sum_{n=1}^{N} (x_n - \bar{x})^2}$$

is used. 
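The estimators of Equations (7.58) to (7.65) can be tried on a small data set. The eight values below are made up for illustration, and the variable names are arbitrary.

```matlab
% Sample moments of a small made-up data set
x = [2 4 4 4 5 5 7 9];
N = numel(x);

xbar       = mean(x);                             % sample mean, Equation (7.58)
v_biased   = sum((x - xbar).^2)/N;                % divisor N, Equation (7.59)
v_unbiased = sum((x - xbar).^2)/(N - 1);          % divisor N - 1, Equation (7.60)
v_onepass  = (sum(x.^2) - N*xbar^2)/(N - 1);      % computational form, Equation (7.61)

s_x  = sqrt(v_biased);                            % divisor N, as used in (7.64) and (7.65)
skew = mean((x - xbar).^3)/s_x^3;                 % Equation (7.64)
kurt = mean((x - xbar).^4)/s_x^4 - 3;             % Equation (7.65)

fprintf('mean %.4f, var (N) %.4f, var (N-1) %.4f, one-pass %.4f\n', ...
        xbar, v_biased, v_unbiased, v_onepass);
fprintf('skew %.5f, kurt %.5f\n', skew, kurt);
```

For these data the mean is 5, the biased variance is 4 and the unbiased variance is 32/7, about 4.5714, so the biased value is the smaller one, as stated above. The computational form (7.61) agrees here, although in floating-point arithmetic it can lose accuracy when the mean is large compared with the spread, because two nearly equal numbers are subtracted.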

Finally, for bivariate processes, the sample covariance is computed by either 

$$s_{xy} = \frac{1}{N}\sum_{n=1}^{N} (x_n - \bar{x})(y_n - \bar{y}) = \frac{1}{N}\left[\left(\sum_{n=1}^{N} x_n y_n\right) - N\bar{x}\bar{y}\right] \quad \text{(biased estimator)} \tag{7.66}$$

or 

$$s_{xy} = \frac{1}{N - 1}\sum_{n=1}^{N} (x_n - \bar{x})(y_n - \bar{y}) = \frac{1}{N - 1}\left[\left(\sum_{n=1}^{N} x_n y_n\right) - N\bar{x}\bar{y}\right] \quad \text{(unbiased estimator)} \tag{7.67}$$

Note that, although we have distinguished the biased and unbiased estimators (the divisor is 
N or N - 1), their differences are usually insignificant if N is ‘large enough’. 

7.4 BRIEF SUMMARY 

1. The relative frequency (or empirical probability) of event E is $f_E = n_E/N$, where $n_E$ is 
the number of occurrences of E in N trials. 

2. A random variable is a function defined on a sample space, i.e. a random variable 
X maps from the sample space $\Omega$ to a range space $\Omega_X$ such that $X(\omega_i) = x_i$. There 
are two types of random variable: a discrete random variable ($\Omega_X$ consists of discrete 
elements) and a continuous random variable ($\Omega_X$ consists of continuous values). 

3. The central limit theorem (roughly speaking) states that the sum of independent 
random variables (that have arbitrary probability distributions) $S_n = \sum_{k=1}^{n} X_k$ becomes 
normally distributed (Gaussian) as n gets large. 

4. The moments of a random variable are summarized in Table 7.1. 

Table 7.1 Summary of moments 

Moment (central) | Estimator | Measures 
1st moment: $\mu_x = E[X]$ | $\bar{x} = \frac{1}{N}\sum_{n=1}^{N} x_n$ | Mean (location) 
2nd moment: $\sigma_x^2 = E[(X - \mu_x)^2]$ | $s_x^2 = \frac{1}{N - 1}\sum_{n=1}^{N} (x_n - \bar{x})^2$ | Variance (spread or dispersion) 
3rd moment: $M_3 = E[(X - \mu_x)^3]$ | $m_3$ | Degree of asymmetry (skewness), $\gamma_1 = E[(X - \mu_x)^3]/\sigma_x^3$ 
4th moment: $M_4 = E[(X - \mu_x)^4]$ | $m_4$ | Degree of flattening (kurtosis), $\gamma_2 = E[(X - \mu_x)^4]/\sigma_x^4 - 3$ 

5. The correlation of X and Y is defined as E[XY], and the covariance of X and Y is 
defined as 

$$\mathrm{Cov}(X, Y) = \sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)]$$

These are related by 

$$\mathrm{Cov}(X, Y) = E[XY] - \mu_x\mu_y = E[XY] - E[X]E[Y]$$

6. Two random variables X and Y are uncorrelated if 

E[XY] = E[X]E[Y] (or Cov(X, Y) = 0) 

7. The correlation coefficient is defined as 

$$\rho_{xy} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x\sigma_y} = \frac{E[(X - \mu_x)(Y - \mu_y)]}{\sigma_x\sigma_y}$$

This is a measure of a linear relationship between two random variables. If $|\rho_{xy}| = 1$, 
then two random variables X and Y are ‘fully’ related linearly. If $\rho_{xy} = 0$, they are 
not linearly related at all. 


7.5 MATLAB EXAMPLES 


Example 7.1: Relative frequency $f_E = n_E/N$ as an estimate of P(E) 

In this MATLAB example, we consider an experiment of tossing a coin, and observe 
how the relative frequency changes as the number of trials (N) increases. 

MATLAB code (with comments) 

    clear all
    % Initialize the random number generator. The MATLAB function 'rand'
    % generates uniformly distributed random numbers, while 'randn' is used
    % to generate normally distributed random numbers.
    rand('state',0);

    % Define the random variable X whose elements are either 1 or 0, and
    % 1000 trials are performed. We regard 1 as the head and 0 as the tail.
    X=round(rand(1,1000));    % 1: head, 0: tail
    % Find indices of head and tail.
    id_head=find(X==1); id_tail=find(X==0);

    % The vector 'head' has ones that correspond to the elements of vector X
    % with 1, and the vector 'tail' has ones that correspond to the elements
    % of vector X with 0.
    N=ones(size(X));
    head=N; head(id_tail)=0;
    tail=N; tail(id_head)=0;

    % Calculate the relative frequencies of head and tail. The MATLAB function
    % 'cumsum(N)' generates a vector whose elements are the cumulative sum of
    % the elements of N.
    fr_head=cumsum(head)./cumsum(N);
    fr_tail=cumsum(tail)./cumsum(N);

    % Plot the relative frequency of head.
    figure(1)
    plot(fr_head)
    xlabel('\itN \rm(Number of trials)')
    ylabel('Relative frequency (head)')
    axis([0 length(N) 0 1])

    % Plot the relative frequency of tail.
    figure(2)
    plot(fr_tail)
    xlabel('\itN \rm(Number of trials)')
    ylabel('Relative frequency (tail)')
    axis([0 length(N) 0 1])

Results: (a) relative frequency of head; (b) relative frequency of tail 

Comments: Note that the relative frequency approaches the theoretical probability (1/2) 
as N increases. 


Example 7.2: Demonstration of the central limit theorem 

The sum of independent random variables, $S_n = \sum_{k=1}^{n} X_k$, becomes normally distributed 
as n gets large, regardless of the individual distributions of $X_k$. 

MATLAB code (with comments) 

    clear all
    % Initialize the random number generator, and generate a matrix X whose
    % elements are drawn from a uniform distribution on the unit interval.
    % The matrix is 10x5000; we regard this as 10 independent random
    % variables with a sample length 5000.
    rand('state',1);
    X=rand(10,5000);

    % Generate the sum of random variables, e.g. S5 is the sum of five random
    % variables. In this example, we consider four cases: S1, S2, S5 and S10.
    S1=X(1,:);
    S2=sum(X(1:2,:));
    S5=sum(X(1:5,:));
    S10=sum(X);

    % Define the number of bins for the histogram. Then, calculate the
    % frequency counts and bin locations for S1, S2, S5 and S10.
    nbin=20; N=length(X);
    [n1 s1]=hist(S1, nbin);
    [n2 s2]=hist(S2, nbin);
    [n5 s5]=hist(S5, nbin);
    [n10 s10]=hist(S10, nbin);

    % Plot the histograms of S1, S2, S5 and S10. A histogram is a graph that
    % shows the distribution of data. In the histogram, the number of counts
    % is normalized by N.
    figure(1)
    bar(s1,n1/N)
    xlabel('\itS\rm_1')
    ylabel('Relative frequency'); axis([0 1 0 0.14])

    % Examine how the distribution changes as the number of sums n increases.
    figure(2)
    bar(s2,n2/N)
    xlabel('\itS\rm_2')
    ylabel('Relative frequency'); axis([0 2 0 0.14])
    figure(3)
    bar(s5,n5/N)
    xlabel('\itS\rm_5')
    ylabel('Relative frequency'); axis([0.4 4.7 0 0.14])
    figure(4)
    bar(s10,n10/N)
    xlabel('\itS\rm_1_0')
    ylabel('Relative frequency'); axis([1.8 8 0 0.14])

Results: histograms of relative frequency for (a) Number of sums: 1; (b) Number of sums: 2; 
(c) Number of sums: 5; (d) Number of sums: 10 

Comments: Note that it quickly approaches a Gaussian distribution. 

Example 7.3: Correlation coefficient as a measure of the linear relationship between 
two random variables X and Y 

Consider the correlation coefficient (i.e. Equation (7.47)) 

$$\rho_{xy} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x\sigma_y} = \frac{E[(X - \mu_x)(Y - \mu_y)]}{\sigma_x\sigma_y}$$

We shall compare three cases: (a) linearly related, $|\rho_{xy}| = 1$; (b) not linearly related, 
$|\rho_{xy}| = 0$; (c) partially linearly related, $0 < |\rho_{xy}| < 1$. 

MATLAB code (with comments) 

    clear all
    % Initialize the random number generator, and define a random variable X.
    randn('state',0);
    X=randn(1,1000);

    % Define a random variable Y1 that is linearly related to X, i.e.
    % Y1 = aX+b. Define another random variable Y2 which is not linearly
    % related to X. Also, define a random variable Y3 which is partially
    % linearly related to X.
    a=2; b=3; Y1=a*X+b;    % fully related
    Y2=randn(1,1000);      % unrelated
    Y3=X+Y2;               % partially related

    % Calculate the covariance of two random variables, Cov(X, Y1),
    % Cov(X, Y2) and Cov(X, Y3). See Equation (7.67) for a computational formula.
    N=length(X);
    s_xy1=sum((X-mean(X)).*(Y1-mean(Y1)))/(N-1);
    s_xy2=sum((X-mean(X)).*(Y2-mean(Y2)))/(N-1);
    s_xy3=sum((X-mean(X)).*(Y3-mean(Y3)))/(N-1);

    % Calculate the correlation coefficient for each case. The results are:
    % r_xy1 = 1 (fully linearly related),
    % r_xy2 = -0.0543 (approximately 0, not linearly related),
    % r_xy3 = 0.6529 (partially linearly related).
    r_xy1=s_xy1/(std(X)*std(Y1))
    r_xy2=s_xy2/(std(X)*std(Y2))
    r_xy3=s_xy3/(std(X)*std(Y3))

    % The degree of linear relationship between two random variables is
    % visually demonstrated. First, plot Y1 versus X; this gives a straight line.
    figure(1)
    plot(X,Y1, '.')
    xlabel('\itX'); ylabel('\itY\rm1')

    % Plot Y2 versus X; the result shows that two random variables are not related.
    figure(2)
    plot(X,Y2, '.')
    xlabel('\itX'); ylabel('\itY\rm2')

    % Plot Y3 versus X; the result shows that there is some degree of linear
    % relationship, but not fully related.
    figure(3)
    plot(X,Y3, '.')
    xlabel('\itX'); ylabel('\itY\rm3')

Results: scatter plots of Y1, Y2 and Y3 versus X 
Example 7.4: Application of the kurtosis coefficient to machinery condition 
monitoring 

Kurtosis coefficient: $\gamma_2 = \dfrac{E[(X - \mu_x)^4]}{\sigma_x^4} - 3$ 

In this example, we use a ‘real’ measured signal. Two acceleration signals are stored 
in the file ‘bearing_fault.mat’ (the data files can be downloaded from the Companion 
Website, www.wiley.com/go/shin_hammond): one is measured on a rotating machine in good working 
order, and the other is measured on the same machine but with a faulty bearing that results 
in a series of spiky transients. Both are measured at a sampling rate of 10 kHz and are 
recorded for 2 seconds. The signals are then high-pass filtered with a cut-on frequency 
of 1 kHz to remove the rotating frequency component and its harmonics. 

Since the machine has many other sources of (random) vibration, in ‘normal’ 
condition, the high-pass-filtered signal can be approximated as Gaussian, thus the 
kurtosis coefficient has a value close to zero, i.e. $\gamma_2 \approx 0$. 

However, if the bearing is faulty, then the signal becomes non-Gaussian due to the 
transient components in the signal, and its distribution will be more peaky (near its mean) 
than Gaussian, i.e. $\gamma_2 > 0$ (leptokurtic). 

MATLAB code (with comments) 

    clear all
    % Load the measured signal, and let x be the signal in good condition,
    % y the signal with a bearing fault.
    load bearing_fault
    x=br_good; y=br_fault;
    N=length(x);

    % Calculate the kurtosis coefficients of both signals (see Equation (7.65)).
    % The results are: kur_x = 0.0145 (i.e. approximately 0) and
    % kur_y = 1.9196 (i.e. leptokurtic).
    kur_x=(sum((x-mean(x)).^4)/N)/(std(x,1)^4)-3
    kur_y=(sum((y-mean(y)).^4)/N)/(std(y,1)^4)-3

    % Calculate the frequency counts and bin locations for signals x and y.
    [nx x1]=hist(x,31);
    [ny y1]=hist(y,31);

    % Plot the signal x, and compare with the corresponding histogram.
    % (The time vector t is not defined in this listing; it is presumably
    % loaded from bearing_fault.mat.)
    figure(1); subplot(2,1,1)
    plot(t,x)
    xlabel('Time (s)'); ylabel('\itx\rm(\itt\rm)')
    subplot(2,1,2)
    bar(x1, nx/N)
    xlabel('\itx'); ylabel('Relative frequency')
    axis([-1 1 0 0.2])

    % Plot the signal y, and compare with the corresponding histogram.
    % Also compare with the signal x.
    figure(2); subplot(2,1,1)
    plot(t,y)
    xlabel('Time (s)'); ylabel('\ity\rm(\itt\rm)')
    subplot(2,1,2)
    bar(y1, ny/N)
    xlabel('\ity'); ylabel('Relative frequency')
    axis([-1 1 0 0.2])

Results 

(a) Signal measured on a machine in good condition, $\gamma_2 = 0.0145$ 

(b) Signal measured on a machine with a bearing fault, $\gamma_2 = 1.9196$ 

Comments: In this example, we have treated the measured time signal as a random vari- 
able. Time-dependent random variables (stochastic processes) are discussed in Chapter 8. 


8 

Stochastic Processes; Correlation 
Functions and Spectra 


Introduction 

In the previous chapter, we did not include ‘time’ in describing random processes. We 
shall now deal with measured signals which are time dependent, e.g. acoustic pressure 
fluctuations at a point in a room, a record of a vibration signal measured on a vehicle 
chassis, etc. In order to describe such (random) signals, we now extend our considerations 
of the previous chapter to a time-dependent random variable. 

We introduce this by a simple example. Let us create a time history by tossing a coin 
every second, and for each ‘head’ we record a unit value and for each ‘tail’ we record a 
zero. We hold these ones and zeros for a second until the next coin toss. A sample record 
might look like Figure 8.1. 

Figure 8.1 A sample time history created from tossing a coin 

The sample space is (H, T), the range space for X is (1, 0) and we have introduced 
time by parameterizing $X(\omega)$ as $X_t(\omega)$, i.e. for each t, X is a random variable defined on 
a sample space. Now, we drop $\omega$ and write X(t), and refer to this as a random function 
of time (shorthand for a random variable defined on a sample space indexed by time). 


Fundamentals of Signal Processing for Sound and Vibration Engineers 
K. Shin and J. K. Hammond. © 2008 John Wiley & Sons, Ltd 

Figure 8.2 An example of an ensemble 

We shall carry over the ideas introduced in the last chapter to these time series which 
display uncertainty, referred to as stochastic processes. The temporal aspects require us to 
bring in some additional definitions and concepts. 

Figure 8.1 depicts a single ‘realization’ of the stochastic process X(t) (obtained by the 
coin tossing experiment). It could be finite in length or infinite, i.e. $-\infty < t < \infty$. Its random 
character introduces us to the concepts (or necessity) of replicating the experiments, i.e. 
producing additional realizations of it, which we could imagine as identical experiments run 
in parallel as shown in Figure 8.2. 

The set of such realizations is called an ensemble (whether finite or infinite). This is 
sometimes written as $\{X(t)\}$ where $-\infty < t < \infty$. 


8.1 PROBABILITY DISTRIBUTION ASSOCIATED WITH A 
STOCHASTIC PROCESS 

We now consider a probability density function for a stochastic process. Let x be a 
particular value of X(t); then the distribution function at time t is defined as 

$$F(x, t) = P[X(t) \le x] \tag{8.1}$$

and 

$$P[x < X(t) \le x + \delta x] = F(x + \delta x, t) - F(x, t) \tag{8.2}$$

Since 

$$\lim_{\delta x \to 0} \frac{P[x < X(t) \le x + \delta x]}{\delta x} = \lim_{\delta x \to 0} \frac{F(x + \delta x, t) - F(x, t)}{\delta x} = \frac{\partial F(x, t)}{\partial x}$$

the probability density function can be written as 

$$p(x, t) = \frac{\partial F(x, t)}{\partial x} \tag{8.3}$$

Note that the probability density function $p(x, t)$ for a stochastic process is time 
dependent, i.e. it evolves with time as shown in Figure 8.3. 



Figure 8.3 Evolution of the probability density function of a stochastic process 


Alternatively, we may visualize this as below. We project the entire ensemble onto 
a single diagram and set up a gate as shown in Figure 8.4. 



Now, we count the number of signals falling within the gate (say, k). Also we count 
the total number of signals (say, N). Then the relative frequency of occurrence of X(t) in 
the gate at time t is k/N. So, as N gets large, we might say that $P[x < X(t) \le x + \delta x]$ 
is estimated by k/N (for large N), so that 

$$p(x, t) = \lim_{\delta x \to 0} \frac{P[x < X(t) \le x + \delta x]}{\delta x} = \lim_{\delta x \to 0} \frac{k}{N\,\delta x} \tag{8.4}$$
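The gate-counting estimate of Equation (8.4) can be sketched with a simulated ensemble. The process below, zero-mean Gaussian with a standard deviation that grows with time, is an assumption made only for illustration, as are the gate position and width. With a finite gate width and a finite N, the result is only an approximation to p(x, t).

```matlab
% Estimate p(x, t) at one time instant by counting realizations in a gate
N = 200000;                        % number of realizations in the ensemble
t = 0:0.5:2;                       % time instants (s)
sigma = 1 + t;                     % assumed time-varying standard deviation
X = randn(N, numel(t)).*repmat(sigma, N, 1);   % each row is one realization

it = 3;                            % index of the gate time, t = 1 s
x0 = 0.5; dx = 0.2;                % gate from x0 to x0 + dx
k = sum(X(:, it) > x0 & X(:, it) <= x0 + dx);  % realizations falling in the gate
p_est = k/(N*dx);                  % estimate of p(x, t), Equation (8.4)

xm = x0 + dx/2;                    % compare with the Gaussian pdf at the gate centre
p_true = exp(-xm^2/(2*sigma(it)^2))/(sigma(it)*sqrt(2*pi));

fprintf('k/(N dx) = %.4f, Gaussian pdf at gate centre = %.4f\n', p_est, p_true);
```

Repeating the count with a different time index `it` gives a different estimate, since the assumed density widens with time; this is the time dependence sketched in Figure 8.3.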


It is at this point that the temporal evolution of the process introduces concepts additional 
to those in Chapter 7. We could conceive of describing how a process might change as time 
evolves, or how a process relates to itself at different times. We could do this by defining joint 
probability density functions by setting up additional gates. 

For example, for two gates at times $t_1$ and $t_2$ this can be described pictorially as in Figure
8.5. Let $k_2$ be the number of signals falling within both gates in the figure. Then, the relative
frequency $k_2/N$ estimates the joint probability for large $N$, i.e.

$$P[x_1 < X(t_1) \le x_1 + \delta x_1 \cap x_2 < X(t_2) \le x_2 + \delta x_2] \approx \frac{k_2}{N} \tag{8.5}$$

Thus, the joint probability density function is written as

$$p(x_1, t_1; x_2, t_2) = \lim_{\delta x_1, \delta x_2 \to 0} \frac{P[x_1 < X(t_1) \le x_1 + \delta x_1 \cap x_2 < X(t_2) \le x_2 + \delta x_2]}{\delta x_1 \delta x_2} = \lim_{\delta x_1, \delta x_2 \to 0} \frac{k_2}{N \delta x_1 \delta x_2} \tag{8.6}$$


Also, the joint distribution function is $F(x_1, t_1; x_2, t_2) = P[X(t_1) \le x_1 \cap X(t_2) \le x_2]$, so
Equation (8.6) can be rewritten as

$$p(x_1, t_1; x_2, t_2) = \frac{\partial^2 F(x_1, t_1; x_2, t_2)}{\partial x_1 \partial x_2} \tag{8.7}$$


For a ‘univariate’ stochastic process, Equation (8.7) can be generalized to the $k$th-order
joint probability density function as

$$p(x_1, t_1; x_2, t_2; \ldots; x_k, t_k) \tag{8.8}$$

However, we shall only consider the first and second order, i.e. $p(x, t)$ and $p(x_1, t_1; x_2, t_2)$.


8.2 MOMENTS OF A STOCHASTIC PROCESS 

As we have defined the moments for random variables in Chapter 7, we now define 
moments for stochastic processes. The only difference is that ‘time’ is involved now, i.e. 
the moments of a stochastic process are time dependent. The first and second moments 
are as follows: 

(a) First moment (mean):

$$\mu_x(t) = E[X(t)] = \int_{-\infty}^{\infty} x\, p(x, t)\, dx \tag{8.9}$$

(b) Second moment (mean square):

$$E[X^2(t)] = \int_{-\infty}^{\infty} x^2 p(x, t)\, dx \tag{8.10}$$

(c) Second central moment (variance):

$$\mathrm{Var}(X(t)) = \sigma_x^2(t) = E[(X(t) - \mu_x(t))^2] = \int_{-\infty}^{\infty} (x - \mu_x(t))^2 p(x, t)\, dx \tag{8.11}$$

Note that $E[(X(t) - \mu_x(t))^2] = E[X^2(t)] - \mu_x^2(t)$, i.e.

$$\sigma_x^2(t) = E[X^2(t)] - \mu_x^2(t) \tag{8.12}$$
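The step from Equation (8.11) to Equation (8.12) follows from the linearity of the expectation operator, treating $\mu_x(t)$ as a constant at the fixed time $t$. As a short worked derivation:

```latex
\begin{aligned}
E[(X(t) - \mu_x(t))^2] &= E[X^2(t) - 2\mu_x(t)X(t) + \mu_x^2(t)] \\
                       &= E[X^2(t)] - 2\mu_x(t)E[X(t)] + \mu_x^2(t) \\
                       &= E[X^2(t)] - \mu_x^2(t)
\end{aligned}
```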


Ensemble Averages 


We noted the concept of the ensemble earlier, i.e. replications of the realizations of the process. 
We now relate the expected value operator E to an ensemble average. Consider the ensemble 
shown in Figure 8.6. 

Then, from the ensemble, we may estimate the mean by using the formula

$$\bar{x}(t) = \frac{1}{N} \sum_{n=1}^{N} X_n(t) \tag{8.13}$$


We now link Equation (8.13) to the theoretical average as follows. First, for a particular time
$t$, group signals according to level (e.g. the gate defined by $x$ and $x + \delta x$). Suppose all $X_n(t)$
in the range $x_1$ and $x_1 + \delta x_1$ are grouped and the number of signals in the group is counted
(say $k_1$). Then, repeating this for other groups, the mean value can be estimated from

$$\bar{x}(t) \approx x_1 \frac{k_1}{N} + x_2 \frac{k_2}{N} + \cdots = \sum_i x_i \frac{k_i}{N} \tag{8.14}$$

where $k_i/N$ is the relative frequency associated with the $i$th gate ($x_i$ to $x_i + \delta x_i$). Now, as
$N \to \infty$, $k_i/N \to p(x_i, t)\delta x_i$, so

$$\bar{x}(t) \to \int_{-\infty}^{\infty} x\, p(x, t)\, dx \tag{8.15}$$

[Three sample records plotted against time, each sampled at the same time $t$]

Figure 8.6 An example of ensemble average

Thus, an average across the ensemble (the infinite set) is identified with the theoretical average,
$\mu_x(t)$, i.e.

$$\mu_x(t) = E[X(t)] = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} X_n(t) \tag{8.16}$$

So, the operator $E[\ ]$ may be interpreted as the Expectation or Ensemble average.
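The ensemble average of Equation (8.13) can be sketched in a few lines. This is an illustration under stated assumptions rather than a prescribed method: the process (a known mean $\mu_x(t) = 2t$ plus unit-variance Gaussian noise) and the name `ensemble_mean` are hypothetical choices made so that the estimate can be compared with a known answer.

```python
import random

def ensemble_mean(records):
    """Average across records at each time index: x_bar(t) = (1/N) * sum over n of X_n(t)."""
    n_records = len(records)
    return [sum(column) / n_records for column in zip(*records)]

rng = random.Random(1)
times = [0.1 * i for i in range(50)]
# Assumed non-stationary process: known mean mu_x(t) = 2 t plus unit-variance noise.
records = [[2.0 * t + rng.gauss(0.0, 1.0) for t in times] for _ in range(2000)]

x_bar = ensemble_mean(records)
worst = max(abs(m - 2.0 * t) for m, t in zip(x_bar, times))
print(f"largest |x_bar(t) - mu_x(t)| over the record: {worst:.3f}")
```

Because the averaging runs across records and not along time, the time-varying mean is followed at every instant; the residual error shrinks roughly as $1/\sqrt{N}$.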


8.3 STATIONARITY 

As we have seen in previous sections, the probability properties of a stochastic process
are dependent upon time, i.e. they vary with time. However, to simplify the situation, we
often assume that those statistical properties are in a ‘steady state’, i.e. they do not change
under a shift in time. For example:

(a) $p(x, t) = p(x)$. This means that $\mu_x(t) = \mu_x$ and $\sigma_x^2(t) = \sigma_x^2$, i.e. the mean and
variance are constant.

(b) $p(x_1, t_1; x_2, t_2) = p(x_1, t_1 + T; x_2, t_2 + T)$, i.e. $p(x_1, t_1; x_2, t_2)$ is a function of time
difference $(t_2 - t_1)$ only, and does not explicitly depend on individual times $t_1$ and $t_2$.

(c) $p(x_1, t_1; x_2, t_2; \ldots; x_k, t_k) = p(x_1, t_1 + T; x_2, t_2 + T; \ldots; x_k, t_k + T)$ for all $k$.

If a process satisfies only two conditions (a) and (b), then we say it is weakly stationary
or simply stationary. If the process satisfies the third condition also, i.e. all the $k$th-order
joint probability density functions are invariant under a shift in time, then we say
it is completely stationary. In this book, we assume that processes satisfy at least two
conditions (a) and (b), i.e. we shall only consider stationary processes. Typical records
of non-stationary and stationary data may be as shown in Figure 8.7.


[Three sample records of $x(t)$ against $t$: non-stationary (varying mean); non-stationary (varying variance); ‘probably’ stationary]

Figure 8.7 Typical ‘sample’ of non-stationary and stationary processes

In general all practical processes are non-stationary, thus the assumption of stationarity
is only an approximation. However, in many practical situations, this assumption gives a 
sufficiently close approximation. For example, if we consider a vibration signal measured on 
a car body when the car is driven at varying speeds on rough roads, then the signal is obviously 
non-stationary since the statistical properties vary depending on the types of road and speed. 
However, if we locate a road whose surface is much the same over a ‘long’ stretch and drive 
the car over it at constant speed, then we might expect the vibration signal to have similar 
characteristics over much of its duration, i.e. ‘approximately’ stationary. 

As we shall see later, the assumption of stationarity is very important, especially when 
we do not have an ensemble of data. In many situations, we have to deal with only a single 
record of data rather than a set of records. In such a case, we cannot perform the average 
across the ensemble, but we may average along time, i.e. we perform a time average instead of
an ensemble average. By implication, stationarity is a necessary condition for the time average to
be meaningful. (Note that, for stationary processes, the statistical properties are independent 
of time.) The problem of deciding whether a process is stationary or not is often difficult and 
generally relies on prior information, though observations and statistical tests on time histories 
can be helpful (Priestley, 1981; Bendat and Piersol, 2000). 


8.4 THE SECOND MOMENTS OF A STOCHASTIC PROCESS; 
COVARIANCE (CORRELATION) FUNCTIONS 


The Autocovariance (Autocorrelation) Function 

As defined in Equation (8.11), the variance of a random variable for a stochastic process
is written $\sigma_x^2(t) = E[(X(t) - \mu_x(t))^2]$. However, a simple generalization of the right
hand side of this equation introduces an interesting concept, when written as
$E[(X(t_1) - \mu_x(t_1))(X(t_2) - \mu_x(t_2))]$. This is the autocovariance function defined as

$$C_{xx}(t_1, t_2) = E[(X(t_1) - \mu_x(t_1))(X(t_2) - \mu_x(t_2))] \tag{8.17}$$

Similar to the covariance of two random variables defined in Chapter 7, the autocovariance
function measures the ‘degree of association’ of the signal at time $t_1$ with itself at time
$t_2$. If the mean value is not subtracted in Equation (8.17), it is called the autocorrelation
function as given by

$$R_{xx}(t_1, t_2) = E[X(t_1)X(t_2)] \tag{8.18}$$

Note that, sometimes, the normalized autocovariance function,
$C_{xx}(t_1, t_2)/[\sigma_x(t_1)\sigma_x(t_2)]$, is called the autocorrelation function, and it is also sometimes called an
autocorrelation coefficient. Thus, care must be taken with the terminology.

If we limit our interest to stationary processes, since the statistical properties remain
the same under a shift of time, Equation (8.17) can be simplified as

$$C_{xx}(t_2 - t_1) = E[(X(t_1) - \mu_x)(X(t_2) - \mu_x)] \tag{8.19}$$

Note that this is now a function of the time difference $(t_2 - t_1)$ only. By letting $t_1 = t$ and
$t_2 = t + \tau$, it can be rewritten as

$$C_{xx}(\tau) = E[(X(t) - \mu_x)(X(t + \tau) - \mu_x)] \tag{8.20}$$

where $\tau$ is called the lag. Note that when $\tau = 0$, $C_{xx}(0) = \mathrm{Var}(X(t)) = \sigma_x^2$. Similarly, the
autocorrelation function for a stationary process is

$$R_{xx}(\tau) = E[X(t)X(t + \tau)] \tag{8.21}$$

Note that $R_{xx}(\tau)$ is a continuous function of $\tau$ for a continuous stochastic process, and
$C_{xx}(\tau)$ and $R_{xx}(\tau)$ are related such that

$$C_{xx}(\tau) = R_{xx}(\tau) - \mu_x^2 \tag{8.22}$$

Interpretation of the Autocorrelation Function in Terms of the Ensemble 

In Section 8.2, we have already seen that the mean value might be defined as an ensemble
average (see Equation (8.16)), i.e.

$$\mu_x(t) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} X_n(t)$$

We now apply the same principle to the autocorrelation function for a stationary process. For
simplicity, we assume that the mean value is zero, i.e. we set $\mu_x = 0$.

For the $n$th record, we form $X_n(t)X_n(t + \tau)$ as shown in Figure 8.8, and average this
product over all records, i.e. an ensemble average.

Then, we can write the autocorrelation function as

$$R_{xx}(\tau) = E[X(t)X(t + \tau)] = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} X_n(t)X_n(t + \tau) \tag{8.23}$$


[Sample records plotted against time; for each record $n$, form $X_n(t)X_n(t + \tau)$ from the values at $t$ and $t + \tau$]

Figure 8.8 Ensemble average for the autocorrelation function

Since we assumed that $\mu_x = 0$, the autocorrelation function at zero lag is $R_{xx}(0) =
\mathrm{Var}(X(t)) = \sigma_x^2$. Also, as $\tau$ increases, it may be reasonable to say that the average
$E[X(t)X(t + \tau)]$ should approach zero, since the values of $X(t)$ and $X(t + \tau)$ for large lags
(time separations) are ‘less associated (related)’ if the process is random. Thus, the general
shape of the autocorrelation function $R_{xx}(\tau)$ may be drawn as in Figure 8.9. Note that,
as can be seen from the figure, the autocorrelation function is an even function of $\tau$ since
$E[X(t)X(t + \tau)] = E[X(t - \tau)X(t)]$.

[Plot of $R_{xx}(\tau)$ against $\tau$, with peak value $\sigma_x^2$ at $\tau = 0$ (for zero mean)]

Figure 8.9 A typical autocorrelation function

We note that the autocorrelation function does not always decay to zero. An example of 
when this does not happen is when the signal has a periodic form (see Section 8.6). 
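The ensemble average of Equation (8.23) can be illustrated with a signal whose autocorrelation does not decay. The sketch below assumes a sine wave with a uniformly distributed random phase, for which the ensemble result is $(A^2/2)\cos(\omega\tau)$; the parameter values and the name `ensemble_autocorr` are hypothetical.

```python
import math
import random

def ensemble_autocorr(records, i, j):
    """Estimate E[X(t_i) X(t_j)] by averaging the product over all records."""
    return sum(r[i] * r[j] for r in records) / len(records)

rng = random.Random(2)
A, omega, dt = 1.5, 2.0 * math.pi, 0.01          # assumed amplitude, rad/s, sample spacing
n_records, n_samples = 5000, 120

records = []
for _ in range(n_records):
    theta = rng.uniform(0.0, 2.0 * math.pi)      # one random phase per realization
    records.append([A * math.sin(omega * k * dt + theta) for k in range(n_samples)])

i = 10                                           # fixed reference time t = i * dt
for lag in (0, 25, 50, 100):
    est = ensemble_autocorr(records, i, i + lag)
    exact = 0.5 * A**2 * math.cos(omega * lag * dt)
    print(f"tau = {lag * dt:.2f} s: estimate {est:+.3f}, (A^2/2) cos(omega tau) {exact:+.3f}")
```

Changing the reference index `i` should leave the estimates essentially unchanged, which is the behaviour expected of a stationary process.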


The Cross-covariance (Cross-correlation) Function 

If we consider two stochastic processes $\{X(t)\}$ and $\{Y(t)\}$ simultaneously, e.g. an
input-output process, then we may generalize the above joint moment. Thus, the
cross-covariance function is defined as

$$C_{xy}(t_1, t_2) = E[(X(t_1) - \mu_x(t_1))(Y(t_2) - \mu_y(t_2))] \tag{8.24}$$

and, if the mean values are not subtracted, the cross-correlation function is defined as

$$R_{xy}(t_1, t_2) = E[X(t_1)Y(t_2)] \tag{8.25}$$

Equation (8.24) or (8.25) is a measure of the association between the signal $X(t)$ at time
$t_1$ and the signal $Y(t)$ at time $t_2$, i.e. it is a measure of cross-association. If we assume
both signals are stationary, then $C_{xy}(t_1, t_2)$ or $R_{xy}(t_1, t_2)$ is a function of time difference
$t_2 - t_1$. Then, as before, letting $t_1 = t$ and $t_2 = t + \tau$, the equations can be rewritten as

$$C_{xy}(\tau) = E[(X(t) - \mu_x)(Y(t + \tau) - \mu_y)] \tag{8.26}$$

and

$$R_{xy}(\tau) = E[X(t)Y(t + \tau)] \tag{8.27}$$

where their relationship is

$$C_{xy}(\tau) = R_{xy}(\tau) - \mu_x \mu_y \tag{8.28}$$

Also, the ensemble average interpretation becomes

$$R_{xy}(\tau) = E[X(t)Y(t + \tau)] = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} X_n(t)Y_n(t + \tau) \tag{8.29}$$

We shall consider examples of this later, but note here that $R_{xy}(\tau)$ can have a general
shape (i.e. neither even, nor odd) and $R_{xy}(\tau) = R_{yx}(-\tau)$.

The cross-correlation (or cross-covariance) function is one of the most important 
concepts in signal processing, and is applied to various practical problems such as 
estimating time delays in a system: radar systems are a classical example; leak detection 
in buried plastic pipe is a more recent application (Gao et al., 2006). Moreover, as we 
shall see later, together with the autocorrelation function, it can be directly related to the 
system identification problem. 


Properties of Covariance (Correlation) Functions

We now list some properties of covariance (correlation) functions; the examples to follow in 
Section 8.6 will serve to clarify these properties: 

(a) The autocovariance (autocorrelation) function: First, we define the autocorrelation
coefficient as

$$\rho_{xx}(\tau) = \frac{C_{xx}(\tau)}{\sigma_x^2} \left( = \frac{R_{xx}(\tau)}{R_{xx}(0)} \text{ for zero mean} \right) \tag{8.30}$$

where $\rho_{xx}(\tau)$ is the normalized (non-dimensional) form of the autocovariance function.

(i) $C_{xx}(\tau) = C_{xx}(-\tau)$; $R_{xx}(\tau) = R_{xx}(-\tau)$ (i.e. the autocorrelation function is ‘even’) (8.31)

(ii) $\rho_{xx}(0) = 1$; $C_{xx}(0) = \sigma_x^2$ ($= R_{xx}(0)$ for zero mean) (8.32)

(iii) $|C_{xx}(\tau)| \le \sigma_x^2$; $|R_{xx}(\tau)| \le R_{xx}(0)$, thus $-1 \le \rho_{xx}(\tau) \le 1$ (8.33)

Proof: $E[(X(t) \pm X(t + \tau))^2] = E[X^2(t) + X^2(t + \tau) \pm 2X(t)X(t + \tau)] \ge 0$, thus
$2R_{xx}(0) \ge 2|R_{xx}(\tau)|$ which gives the above result.

(b) The cross-covariance (cross-correlation) function: We define the cross-correlation
coefficient as

$$\rho_{xy}(\tau) = \frac{C_{xy}(\tau)}{\sigma_x \sigma_y} \left( = \frac{R_{xy}(\tau)}{\sqrt{R_{xx}(0)R_{yy}(0)}} \text{ for zero mean} \right) \tag{8.34}$$

(i) $C_{xy}(-\tau) = C_{yx}(\tau)$; $R_{xy}(-\tau) = R_{yx}(\tau)$ (neither odd nor even in general) (8.35)

(ii) $|C_{xy}(\tau)|^2 \le \sigma_x^2 \sigma_y^2$; $|R_{xy}(\tau)|^2 \le R_{xx}(0)R_{yy}(0)$, thus $-1 \le \rho_{xy}(\tau) \le 1$ (8.36)

Proof: For real values $a$ and $b$,

$$E[(aX(t) + bY(t + \tau))^2] = E[a^2 X^2(t) + b^2 Y^2(t + \tau) + 2abX(t)Y(t + \tau)] \ge 0$$

i.e. $a^2 R_{xx}(0) + 2abR_{xy}(\tau) + b^2 R_{yy}(0) \ge 0$, or if $b \ne 0$

$$(a/b)^2 R_{xx}(0) + 2(a/b)R_{xy}(\tau) + R_{yy}(0) \ge 0$$

The left hand side is a quadratic expression in $a/b$, and this may be rewritten (after multiplying by $R_{xx}(0)$ and completing the square) as

$$[R_{xx}(0)(a/b) + R_{xy}(\tau)]^2 \ge R_{xy}^2(\tau) - R_{xx}(0)R_{yy}(0)$$

For any values of $a/b$, this inequality must be satisfied. Thus

$$R_{xy}^2(\tau) - R_{xx}(0)R_{yy}(0) \le 0$$

and so the result follows.

(iii) If $X(t)$ and $Y(t)$ are uncorrelated, $C_{xy}(\tau) = 0$; $R_{xy}(\tau) = \mu_x \mu_y$.


Note that the above correlation coefficients are particularly useful when $X(t)$ and $Y(t)$
have different scales. Although we have distinguished the covariance functions and correlation
functions, they differ only in the presence of the mean values. In most practical situations,
the mean values are subtracted prior to any processing of the data, so the correlation
functions and the covariance functions are the same in effect. Consequently, the ‘correlation’
functions are often preferred in engineering.


8.5 ERGODICITY AND TIME AVERAGES 


The moments discussed in previous sections are based on the theoretical probability 
distributions of the stochastic processes and have been interpreted as ensemble averages, 
i.e. we need an infinite number of records whose statistical properties are identical. 
However, in general, ensemble averaging is not feasible as we usually have only a single 
realization (record) of limited length. Then, the only way to perform the average is along 
the time axis, i.e. a time average may be used in place of an ensemble average. The 
question is: do time averages along one record give the same results as an ensemble 
average? The answer is ‘sometimes’, and when they do, such averages are said to be 
ergodic. 

Note that we cannot simply refer to a process as ergodic. Ergodicity must be related 
directly to the particular average in question, e.g. mean value, autocorrelation function 
and cross-correlation function, etc. Anticipating a result from statistical estimation theory, 
we can state that stationary processes are ergodic with respect to the mean and covariance 
functions. Thus, for example, the mean value can be written as

$$\mu_x = \lim_{T \to \infty} \frac{1}{T} \int_0^T x(t)\, dt \tag{8.37}$$

i.e. the time average over any single time history will give the same value as the ensemble
average $E[X(t)]$.

If we consider a signal with a finite length $T$, then the estimate of the mean value
can be obtained by

$$\hat{\mu}_x = \bar{x} = \frac{1}{T} \int_0^T x(t)\, dt \tag{8.38}$$

or if the signal is digitized using samples every $\Delta$ seconds so that $T = N\Delta$, then

$$\bar{x} = \frac{1}{N\Delta} \sum_{n=0}^{N-1} x(n\Delta)\Delta$$

thus

$$\bar{x} = \frac{1}{N} \sum_{n=0}^{N-1} x(n\Delta) \tag{8.39}$$

Note that the mean value $\bar{x}$ is a single number characterizing the offset (or d.c. level) as being
the same over the whole signal.

If the offset changes at some point (i.e. a simple type of non-stationary signal), e.g. at
$t = T_1$ as shown in Figure 8.10, then the ‘estimate’ of the mean value using all $T$ seconds will
produce a mean for the whole record, whereas it might have been preferable to split up the
averaging into two segments to obtain $\bar{x}_1$ and $\bar{x}_2$.



This idea may be generalized to estimate a ‘drifting’ or ‘slowly varying’ mean value 
by using local averaging. The problem with local averaging (or local smoothing) is that, of 
necessity, fewer sample values are used in the computation and so the result is subject to more 
fluctuation (variability). Accordingly, if one wants to ‘track’ some feature of a non-stationary 
process then there is a trade-off between the need to have a local (short) average to follow the 
trends and a long enough segment so that sample fluctuations are not too great. The details of 
the estimation method and estimator errors will be discussed in Chapter 10. 

Similar to the mean value, the estimate of time-averaged mean square value is (we follow
the notation of Bendat and Piersol, 2000)

$$\overline{x^2} = \hat{\psi}_x^2 = \frac{1}{T} \int_0^T x^2(t)\, dt \tag{8.40}$$

and in digital form

$$\hat{\psi}_x^2 = \frac{1}{N} \sum_{n=0}^{N-1} x^2(n\Delta) \tag{8.41}$$

The root mean square (rms) is the positive square root of this. Also, the variance of the signal
can be estimated as

$$\hat{\sigma}_x^2 = \frac{1}{T} \int_0^T (x(t) - \bar{x})^2\, dt \tag{8.42}$$

In digital form, the unbiased estimator is

$$\hat{\sigma}_x^2 = \frac{1}{N - 1} \sum_{n=0}^{N-1} (x(n\Delta) - \bar{x})^2 \tag{8.43}$$

where $\hat{\sigma}_x$ is the estimate of the standard deviation.

For the joint moments, the ensemble averages can also be replaced by the time averages
if they are ergodic such that, for example, the cross-covariance function is

$$C_{xy}(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T (x(t) - \mu_x)(y(t + \tau) - \mu_y)\, dt \tag{8.44}$$

i.e. the time average shown above is equal to $E[(X(t) - \mu_x)(Y(t + \tau) - \mu_y)]$ and holds for
any member of the ensemble. The (unbiased) estimate of the cross-covariance function is

$$\hat{C}_{xy}(\tau) = \begin{cases} \dfrac{1}{T - \tau} \displaystyle\int_0^{T - \tau} (x(t) - \bar{x})(y(t + \tau) - \bar{y})\, dt, & 0 \le \tau < T \\[2ex] \dfrac{1}{T - |\tau|} \displaystyle\int_{|\tau|}^{T} (x(t) - \bar{x})(y(t + \tau) - \bar{y})\, dt, & -T < \tau \le 0 \end{cases} \tag{8.45}$$

In Equation (8.45), if the divisor $T$ is used, it is called the biased estimate. Since $\hat{C}_{xy}(\tau) =
\hat{C}_{yx}(-\tau)$ we may need to define $\hat{C}_{xy}(\tau)$ for positive $\tau$ only. The corresponding digital
version can be written as

$$\hat{C}_{xy}(m\Delta) = \frac{1}{N - m} \sum_{n=0}^{N-m-1} (x(n\Delta) - \bar{x})(y((n + m)\Delta) - \bar{y}), \quad 0 \le m < N \tag{8.46}$$


In this section, we have only defined unbiased estimators. Other estimators and corresponding
errors will be discussed in Chapter 10. Based on Equations (8.44)-(8.46), the same
form of equations can be used for the autocovariance function $C_{xx}(\tau)$ by simply replacing $y$
with $x$, and for correlation functions $R_{xx}(\tau)$ and $R_{xy}(\tau)$ by omitting the mean values in the
expressions.

8.6 EXAMPLES

We now demonstrate several examples to illustrate probability density functions and covariance 
(correlation) functions. 


Probability Distribution of a Sine Wave M8.1


A sine wave $x(t) = A \sin(\omega t + \theta)$ may be considered random if the phase angle $\theta$ is random,
i.e. $\theta$ is now a random variable, and so each realization has a phase drawn from some probability
density function $p(\theta)$ ($A$ and $\omega$ are known constants).

For a fixed value of $t$ we shall compute the probability density function of $x$. To do this, we
work from first principles. We want $p(x, t) = dF(x, t)/dx$ where $F(x, t) = P[X(t) \le x]$.
Let us first calculate $F(x, t)$ and then differentiate with respect to $x$. We shall assume that

$$p(\theta) = \begin{cases} \dfrac{1}{2\pi} & \text{for } 0 \le \theta \le 2\pi \\ 0 & \text{otherwise} \end{cases} \tag{8.47}$$

i.e. $\theta$ is uniformly distributed. Then the distribution function is

$$F(\theta) = \begin{cases} 0 & \theta < 0 \\ \dfrac{\theta}{2\pi} & 0 \le \theta \le 2\pi \\ 1 & \theta > 2\pi \end{cases} \tag{8.48}$$

Since it is a stationary process, an arbitrary value of $t$ can be used, i.e. $F(x, t) = F(x)$ and
$p(x, t) = p(x)$, so let $t = 0$ for convenience. Then

$$F(x) = P[X \le x] = P[A \sin\theta \le x] = P\left[\sin\theta \le \frac{x}{A}\right] \tag{8.49}$$

This condition is equivalent to

$$P\left[\theta \le \sin^{-1}\left(\frac{x}{A}\right)\right] = F(\theta)\big|_{\theta = \sin^{-1}(x/A)}$$

and

$$P\left[\pi - \sin^{-1}\left(\frac{x}{A}\right) < \theta \le 2\pi\right] = F(\theta)\big|_{\theta = 2\pi} - F(\theta)\big|_{\theta = \pi - \sin^{-1}(x/A)} \tag{8.50}$$

Note that, in the above ‘equivalent’ probability condition, the values of $\sin^{-1}(x/A)$ are
defined in the range $-\pi/2$ to $\pi/2$. Also $F(\theta)\big|_{\theta = \sin^{-1}(x/A)}$ is allowed to have negative values. Then, the
distribution function $F(x, t)$ becomes

$$F(x, t) = F(x) = \frac{1}{2} + \frac{1}{\pi} \sin^{-1}\left(\frac{x}{A}\right) \tag{8.51}$$

and this leads to the probability density function as

$$p(x) = \frac{dF(x)}{dx} = \frac{1}{\pi} \frac{1}{\sqrt{A^2 - x^2}} \tag{8.52}$$

which has the U shape of the probability density function of a sine wave as shown in Figure 8.11.

[Plots of $p(x)$ and $F(x)$ against $x$ over the range $-A$ to $A$]

Figure 8.11 Probability density function and distribution function of a sine wave


As an alternative method, we demonstrate the use of a time average for the probability
density calculation (assuming ‘ergodicity’ for the probability density function). Consider
the sine wave above (and set the phase to zero for convenience, i.e. $\theta = 0$ for a particular
realization). Then, we set up a gate ($x \le x(t) \le x + dx$) as shown in Figure 8.12, and evaluate
the time spent within the gate. Then

$$p(x)\,dx \approx \text{probability of lying in the gate} \approx \frac{\sum_i dt_i}{T} \tag{8.53}$$

where $T$ = the total record length, and $\sum_i dt_i$ = total time in the gate.

For the sine wave, we take one period, i.e. $T = T_P$. Since $dx = \omega A \cos(\omega t)\,dt$, it follows
that

$$dt = \frac{dx}{\omega A \cos(\omega t)} = \frac{dx}{\omega A \sqrt{1 - (x/A)^2}} \tag{8.54}$$

Also, let $dt_1 = dt_2 = dt$, so

$$p(x)\,dx = \frac{2\,dt}{T_P} = \frac{2\,dx}{(2\pi/\omega)\,\omega A \sqrt{1 - (x/A)^2}} = \frac{dx}{\pi \sqrt{A^2 - x^2}}$$

which is the same result as Equation (8.52), i.e.

$$p(x) = \frac{1}{\pi \sqrt{A^2 - x^2}} \tag{8.55}$$
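The time-in-gate argument of Equation (8.53) can be checked numerically by sampling one period of the sine wave finely and counting the fraction of samples inside a narrow gate. This is a sketch under assumed values (amplitude, gate width, number of samples); the name `time_fraction_in_gate` is hypothetical.

```python
import math

def time_fraction_in_gate(signal, x, dx):
    """Fraction of samples with x < x(t) <= x + dx, which approximates p(x) dx."""
    return sum(1 for v in signal if x < v <= x + dx) / len(signal)

A = 2.0                                   # assumed amplitude
n = 200_000                               # samples spread over exactly one period
signal = [A * math.sin(2.0 * math.pi * k / n) for k in range(n)]

dx = 0.02
for x in (-1.5, 0.0, 1.0):
    est = time_fraction_in_gate(signal, x, dx) / dx
    x_mid = x + 0.5 * dx                  # compare with the density at the gate centre
    exact = 1.0 / (math.pi * math.sqrt(A**2 - x_mid**2))
    print(f"x = {x:+.2f}: gate estimate {est:.4f}, 1/(pi sqrt(A^2 - x^2)) {exact:.4f}")
```

Close to $x = \pm A$ the density grows without bound, so a finite gate there gives only a coarse average of the true curve.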




The Autocorrelation (Autocovariance) Function 
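A Sine Wave M8.2

Let $x(t) = A \sin(\omega t + \theta)$ where $\theta$ is a random variable with a uniform probability density
function as discussed above. For any fixed value of $t$, let

$$\begin{aligned} x(t) &= A \sin(\omega t + \theta) = x_1(\theta) \\ x(t + \tau) &= A \sin[\omega(t + \tau) + \theta] = x_2(\theta) \end{aligned} \tag{8.56}$$

The mean value is

$$\mu_x = E[x(t)] = E[x_1(\theta)] = \int_{-\infty}^{\infty} A \sin(\omega t + \theta)\, p(\theta)\, d\theta = 0 \tag{8.57}$$

Then the autocorrelation function becomes

$$\begin{aligned} R_{xx}(\tau) &= E[x(t)x(t + \tau)] = E[x_1(\theta)x_2(\theta)] \\ &= \int_{-\infty}^{\infty} A^2 \sin(\omega t + \theta) \sin[\omega(t + \tau) + \theta]\, p(\theta)\, d\theta \\ &= \frac{A^2}{2\pi} \int_0^{2\pi} \frac{1}{2} [\cos(\omega\tau) - \cos(2\omega t + \omega\tau + 2\theta)]\, d\theta \\ &= \frac{A^2}{2} \cos(\omega\tau) \end{aligned} \tag{8.58}$$

which is a cosine function as shown in Figure 8.13. Note that this is an example where the
autocorrelation does not decay to zero as $\tau \to \infty$.

Assuming ‘ergodicity’ for the autocorrelation function, the time average for a single trace
$x(t) = A \sin(\omega t + \theta)$ gives the same result:

$$\begin{aligned} R_{xx}(\tau) &= \lim_{T \to \infty} \frac{1}{T} \int_0^T x(t)x(t + \tau)\, dt = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} x(t)x(t + \tau)\, dt \\ &= \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} A^2 \sin(\omega t + \theta) \sin(\omega t + \omega\tau + \theta)\, dt \\ &= \lim_{T \to \infty} \frac{A^2}{2T} \int_{-T}^{T} \frac{1}{2} [\cos(\omega\tau) - \cos(2\omega t + \omega\tau + 2\theta)]\, dt = \frac{A^2}{2} \cos(\omega\tau) \end{aligned} \tag{8.59}$$

[Plot of $R_{xx}(\tau)$ against $\tau$: a cosine with peak value $A^2/2$ at $\tau = 0$]

Figure 8.13 Autocorrelation function of a sine wave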




It is appropriate to emphasize the meaning of the autocorrelation function by considering
$R_{xx}(\tau)$ as the ‘average’ of the product $x(t)x(t + \tau)$. The term $x(t + \tau)$ is a shifted version of
$x(t)$, so this product is easily visualized as the original time history $x(t)$ which is ‘overlaid’ by
a shifted version of itself (when $\tau$ is positive the shift is to the left). So the product highlights
the ‘match’ between $x(t)$ and itself shifted by $\tau$ seconds.

For the sine wave, which is periodic, this matching is ‘perfect’ for $\tau = 0$. When $\tau$ = 1/4
period (i.e. $\tau = \pi/2\omega$), the positive and negative matches cancel out when integrated. When
$\tau$ = 1/2 period this is perfect matching again but with a sign reversal, and so on. This shows
that the periodic signal has a periodic autocorrelation function. Note that, as can be seen in
Equation (8.59), the autocorrelation (autocovariance) function does not depend on $\theta$, i.e. it is
‘phase blind’.


Asynchronous Random Telegraph Signal 

Consider a time function that switches between two values $+a$ and $-a$ as shown in Figure
8.14. The crossing times are random and we assume that it is modelled as a Poisson process
with a rate parameter $\lambda$. Then, the probability of $k$ crossings in time $\tau$ is

$$p_k = \frac{e^{-\lambda|\tau|} (\lambda|\tau|)^k}{k!} \tag{8.60}$$

where $\lambda$ is the number of crossings per unit time.

[Plot of $x(t)$ against $t$, switching between $+a$ and $-a$ at random times]

Figure 8.14 Asynchronous random telegraph signal

If we assume that the process is in steady state, i.e. $t \to \infty$, then $P[X(t) = a] =
P[X(t) = -a] = 1/2$. So the mean value $\mu_x = E[x(t)] = 0$. And the product $x(t)x(t + \tau)$ is
either $a^2$ or $-a^2$, i.e. it is $a^2$ if the number of crossings is even in time $\tau$ and $-a^2$ if the number
of crossings is odd in time $\tau$. The total probability for $a^2$ (i.e. an even number of crossings
occurs) is $\sum_{k=0}^{\infty} p_{2k}$, and the total probability for $-a^2$ is $\sum_{k=0}^{\infty} p_{2k+1}$. Thus, the autocorrelation
function becomes

$$\begin{aligned} R_{xx}(\tau) &= E[x(t)x(t + \tau)] = \sum_{k=0}^{\infty} [a^2 p_{2k} - a^2 p_{2k+1}] \\ &= a^2 e^{-\lambda|\tau|} \left[ \sum_{k=0}^{\infty} \left( \frac{(\lambda|\tau|)^{2k}}{(2k)!} - \frac{(\lambda|\tau|)^{2k+1}}{(2k + 1)!} \right) \right] = a^2 e^{-\lambda|\tau|} \left[ \sum_{k=0}^{\infty} \frac{(-\lambda|\tau|)^k}{k!} \right] = a^2 e^{-2\lambda|\tau|} \end{aligned} \tag{8.61}$$

which is an exponentially decaying function as shown in Figure 8.15, where the decay rate is
controlled by the parameter $\lambda$.

Figure 8.15 Autocorrelation function of an asynchronous random telegraph signal


White Noise 

This is a very useful theoretical concept that has many desirable features in signal
processing. In the previous example of the Poisson process, the autocorrelation function is
defined as $R_{xx}(\tau) = a^2 e^{-2\lambda|\tau|}$. Note that, as the parameter $\lambda$ gets larger (i.e. the number
of crossings per unit time increases), $R_{xx}(\tau)$ becomes narrower. We may relate this to
the concept of white noise by considering a limiting form. As $\lambda \to \infty$, the process is
very erratic and $R_{xx}(\tau)$ becomes ‘spike-like’. In order that $R_{xx}(\tau)$ does not ‘disappear
completely’ we can allow the value $a$ to become large in compensation. This gives an
idea of a ‘completely erratic’ random process whose autocorrelation (autocovariance)
function is like a delta function, and the process that has this property is called white
noise, i.e.

Autocorrelation function of white noise: $R_{xx}(\tau) = k\delta(\tau)$ (8.62)

An example of the autocorrelation function of white noise is shown in MATLAB
Example 8.10 which demonstrates some important aspects of correlation analysis related
to system identification. Note, however, that in continuous time such processes cannot
occur in practice, and can only be approximated. Thus, we often refer to ‘band-limited
white noise’ whose spectral density function is constant within a band as we shall see in
Section 8.7.
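The behaviour of the asynchronous random telegraph signal, and the narrowing of $R_{xx}(\tau)$ as the crossing rate grows, can be explored by simulation. The sketch below is illustrative only: it samples the signal on a regular grid, draws exponential waiting times between crossings, and assumes ergodicity so that a time average along one long record can be compared with $a^2 e^{-2\lambda|\tau|}$. The function names and parameter values are hypothetical.

```python
import math
import random

def telegraph(n_samples, dt, rate, a, rng):
    """Sampled asynchronous random telegraph signal with Poisson crossings (rate per second)."""
    level = a if rng.random() < 0.5 else -a
    next_cross = rng.expovariate(rate)
    x = []
    for k in range(n_samples):
        t = k * dt
        while next_cross <= t:            # apply every crossing that occurred up to time t
            level = -level
            next_cross += rng.expovariate(rate)
        x.append(level)
    return x

def autocorr(x, m):
    """Time-average estimate of R_xx at lag m samples (mean not subtracted)."""
    n = len(x) - m
    return sum(x[k] * x[k + m] for k in range(n)) / n

rng = random.Random(4)
a, rate, dt = 1.0, 2.0, 0.01              # assumed level, crossings per second, sample spacing
x = telegraph(200_000, dt, rate, a, rng)
for m in (0, 10, 25, 50):
    tau = m * dt
    print(f"tau = {tau:.2f} s: estimate {autocorr(x, m):+.3f}, "
          f"a^2 exp(-2 lambda tau) {a**2 * math.exp(-2.0 * rate * tau):+.3f}")
```

Re-running with a larger `rate` should give estimates that fall away from the zero-lag value more quickly, in line with the limiting argument for white noise.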


Synchronous Random Telegraph Signal 

Consider a switching signal where now the signal can only change sign at ‘event points’ spaced
$\Delta$ seconds apart. At each event point the signal may switch or not (with equal probability) as
shown in Figure 8.16.

[Plot of $x(t)$ against $t$, switching between $+a$ and $-a$ only at the event points]

Figure 8.16 Synchronous random telegraph signal

Since this has equal probability, the mean value is $\mu_x = 0$. We shall calculate $R_{xx}(\tau)$
using time averages. For $0 \le \tau \le \Delta$, the product $x(t)x(t + \tau)$ is

$$x(t)x(t + \tau) = \begin{cases} a^2 & \text{for a fraction } \dfrac{\Delta - \tau}{\Delta} \text{ of the time} \\[1.5ex] a^2 & \text{for a fraction } \dfrac{1}{2}\left(\dfrac{\tau}{\Delta}\right) \text{ of the time} \\[1.5ex] -a^2 & \text{for a fraction } \dfrac{1}{2}\left(\dfrac{\tau}{\Delta}\right) \text{ of the time} \end{cases} \tag{8.63}$$

Thus, the autocorrelation function becomes

$$R_{xx}(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T x(t)x(t + \tau)\, dt = a^2 \left(1 - \frac{\tau}{\Delta}\right), \quad 0 \le \tau \le \Delta \tag{8.64}$$

Note that for $|\tau| > \Delta$, the probabilities of $a^2$ and $-a^2$ are the same, so

$$R_{xx}(\tau) = 0, \quad |\tau| > \Delta \tag{8.65}$$

As a result, the autocorrelation function is as shown in Figure 8.17.

Figure 8.17 Autocorrelation function of a synchronous random telegraph signal
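The triangular result of Equations (8.64) and (8.65) can be checked with a simulated record. The following is a minimal sketch with assumed values (20 samples per interval $\Delta$, 20 000 intervals) and hypothetical function names; the time average is taken along a single record.

```python
import random

def sync_telegraph(n_intervals, samples_per_interval, a, rng):
    """Signal that may flip sign (probability 1/2) only at event points spaced Delta apart."""
    x, level = [], a
    for _ in range(n_intervals):
        if rng.random() < 0.5:
            level = -level
        x.extend([level] * samples_per_interval)
    return x

def autocorr(x, m):
    """Time-average estimate of R_xx at lag m samples."""
    n = len(x) - m
    return sum(x[k] * x[k + m] for k in range(n)) / n

rng = random.Random(5)
a, spi = 1.0, 20                          # assumed level and samples per interval Delta
x = sync_telegraph(20_000, spi, a, rng)
for m in (0, 5, 10, 20, 30):
    frac = m / spi                        # tau / Delta
    exact = a**2 * (1.0 - frac) if frac <= 1.0 else 0.0
    print(f"tau/Delta = {frac:.2f}: estimate {autocorr(x, m):+.3f}, triangle {exact:+.3f}")
```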


A Simple Practical Problem M8.3

To demonstrate an application of the autocorrelation function, consider the simple acoustic
problem shown in Figure 8.18. The signal at the microphone may be written as

$$x(t) = as(t - \Delta_1) + bs(t - \Delta_2) \tag{8.66}$$

[Diagram labels: hard reflector; source, $s(t)$; path (1) (delay, $\Delta_1$); mic., $x(t)$]

Figure 8.18 A simple acoustic example

We assume that the source signal is broadband, i.e. $R_{ss}(\tau)$ is narrow. By letting
$\Delta = \Delta_2 - \Delta_1$ the autocorrelation function of the microphone signal $x(t)$ is

$$\begin{aligned} R_{xx}(\tau) &= E[x(t)x(t + \tau)] \\ &= E[(as(t - \Delta_1) + bs(t - \Delta_2))(as(t - \Delta_1 + \tau) + bs(t - \Delta_2 + \tau))] \\ &= (a^2 + b^2)R_{ss}(\tau) + abR_{ss}(\tau - (\Delta_2 - \Delta_1)) + abR_{ss}(\tau + (\Delta_2 - \Delta_1)) \\ &= (a^2 + b^2)R_{ss}(\tau) + abR_{ss}(\tau - \Delta) + abR_{ss}(\tau + \Delta) \end{aligned} \tag{8.67}$$

That is, it consists of the autocorrelation function of the source signal and its shifted
versions as shown in Figure 8.19. For this particular problem, the relative time delay
$\Delta = \Delta_2 - \Delta_1$ can be identified from the autocorrelation function of $x(t)$, and also the
relative distance can be found by multiplying $\Delta$ by the speed of sound.

[Plot of $R_{xx}(\tau)$ against $\tau$]

Figure 8.19 Autocorrelation function for time delay problem

We shall see later that if we also measure the source signal, then $\Delta_1$ can also be
found by using the cross-correlation function. Thus, the complete transmission paths
can be identified as long as $R_{ss}(\tau)$ is narrow compared with the relative time delay
$\Delta = \Delta_2 - \Delta_1$. This will be demonstrated through a MATLAB example in Chapter 9.
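The structure of Equation (8.67) suggests a simple numerical experiment: build a microphone signal from two delayed copies of a broadband source and look for the side peak of the autocorrelation at the relative delay. The sketch below assumes a white Gaussian source, delays expressed in whole samples, and arbitrary path gains; all names and values are hypothetical.

```python
import random

def autocorr(x, m):
    """Time-average estimate of R_xx at lag m samples."""
    n = len(x) - m
    return sum(x[k] * x[k + m] for k in range(n)) / n

rng = random.Random(6)
n, d1, d2 = 10_000, 30, 75                # assumed record length and path delays in samples
a, b = 1.0, 0.6                           # assumed path gains
s = [rng.gauss(0.0, 1.0) for _ in range(n)]                # broadband source: R_ss is narrow
x = [a * s[k - d1] + b * s[k - d2] for k in range(d2, n)]  # microphone signal, Equation (8.66)

# Skip lag 0, where the term (a^2 + b^2) R_ss(0) dominates, and search the side peak.
r = {m: autocorr(x, m) for m in range(1, 120)}
delay = max(r, key=r.get)
print("estimated relative delay (samples):", delay, "  true d2 - d1:", d2 - d1)
```

Multiplying the estimated delay by the sample interval and by the speed of sound would then give the relative path length; with a narrowband source the side peak would broaden and the estimate would become less distinct.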


The Autocorrelation (Autocovariance) Function of Non-stochastic Processes 

It is worth noting that the time average definition may be utilized with non-random (i.e. 
deterministic) functions and even for transient phenomena. In such cases we may or may not 
use the divisor T. 

1. A square wave: Consider a square periodic signal as shown in Figure 8.20. This function
is periodic, so the autocorrelation (autocovariance) function will be periodic, and we use
the autocorrelation function as

$$R_{xx}(\tau) = \frac{1}{T_P} \int_0^{T_P} x(t)x(t + \tau)\, dt \tag{8.68}$$

To form $R_{xx}(\tau)$, we sketch $x(t + \tau)$ and ‘slide it over $x(t)$’ to form the integrand. Then,
it can be easily verified that $R_{xx}(\tau)$ is a triangular wave as shown in Figure 8.21.

[Plot of $x(t)$ against $t$: a square wave whose lower level is $-A$]

Figure 8.20 A square periodic signal

[Figure 8.21: plot of $R_{xx}(\tau)$ against $\tau$]

2. A transient signal: In such a case there is no point in dividing by $T$, so the autocorrelation
function for a transient signal is defined as

$$R_{xx}(\tau) = \int_{-\infty}^{\infty} x(t)x(t + \tau)\, dt \tag{8.69}$$
We note an important link with the frequency domain, i.e. if

$$x(t) = \int_{-\infty}^{\infty} X(f)e^{j2\pi ft}\, df, \qquad X(f) = \int_{-\infty}^{\infty} x(t)e^{-j2\pi ft}\, dt$$

then the Fourier transform of $R_{xx}(\tau)$ is

$$\begin{aligned} F\{R_{xx}(\tau)\} &= \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j2\pi f\tau}\, d\tau \\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x(t)x(t + \tau)e^{-j2\pi f\tau}\, d\tau\, dt \quad (\text{let } t_1 = t + \tau) \\ &= \int_{-\infty}^{\infty} x(t_1)e^{-j2\pi ft_1}\, dt_1 \int_{-\infty}^{\infty} x(t)e^{j2\pi ft}\, dt \\ &= X(f)X^*(f) = |X(f)|^2 \end{aligned} \tag{8.70}$$

Thus, the following relationship holds:

$$|X(f)|^2 = \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j2\pi f\tau}\, d\tau \tag{8.71}$$

$$R_{xx}(\tau) = \int_{-\infty}^{\infty} |X(f)|^2 e^{j2\pi f\tau}\, df \tag{8.72}$$

i.e. the ‘energy spectral density’ and the autocorrelation function are Fourier pairs. This
will be discussed further in Section 8.7.
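A discrete-time analogue of Equation (8.71) can be verified directly. The sketch below is an assumption-laden illustration: it replaces the integrals by sums over a finite sequence (frequency $f$ in cycles per sample), uses an arbitrary decaying oscillation as the transient, and evaluates the transforms by direct summation. The names `dtft` and `autocorr_transient` are hypothetical.

```python
import cmath
import math

def dtft(seq, start, f):
    """Sum of seq[i] * exp(-j 2 pi f n) with n = start + i (f in cycles per sample)."""
    return sum(v * cmath.exp(-2j * math.pi * f * (start + i)) for i, v in enumerate(seq))

def autocorr_transient(x):
    """R_xx[m] = sum over k of x[k] x[k + m] for m = -(N-1)..(N-1), with no division by T."""
    n = len(x)
    return [sum(x[k] * x[k + m] for k in range(max(0, -m), min(n, n - m)))
            for m in range(-(n - 1), n)]

# Assumed transient: a short decaying oscillation.
x = [math.exp(-0.2 * k) * math.cos(0.9 * k) for k in range(40)]
r = autocorr_transient(x)

for f in (0.0, 0.1, 0.25):
    esd = abs(dtft(x, 0, f)) ** 2                     # |X(f)|^2
    from_r = dtft(r, -(len(x) - 1), f)                # transform of R_xx
    print(f"f = {f:.2f}: |X(f)|^2 = {esd:.6f}, F{{R_xx}} = {from_r.real:.6f}")
```

The two columns should agree to rounding error, and the imaginary part of the transform of $R_{xx}$ should be negligible because the autocorrelation is an even function of the lag.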


The Cross-correlation ( Cross-covariance ) Function 


Two Harmonic Signals MS 4 

Consider the two functions 

x(t) — A sin(<yf + 9 X ) + B 
y(t) — C sin(&>f + 9 y ) + D sin (neat + <j>) 
We form the cross-correlation function using the time average, i.e. 


T 

Rxy(?) = lim — / x(t)y(t + r )dt 
T^oo T J 
o 

1 r i 

= -AC cos [cur - (9 X - 9 y )] (8.74) 

and compare this with the autocorrelation functions which are given as 

A 2 

R xx ( t) = cos(cur) + B 2 

, , (8-75) 

C 2 D 2 

R yy (z) = cos(cur) + cos(ncnr) 

Note that the cross-correlation function finds the components in $y(t)$ that match or fit $x(t)$. More importantly, the cross-correlation preserves the relative phase $(\theta_x - \theta_y)$, i.e. it detects the delay that is associated with the 'correlated (in a linear manner)' components of $x(t)$ and $y(t)$.

Once again, an intuitive idea of what a cross-correlation reveals arises from visualizing the product $x(t)y(t+\tau)$ as looking at the match between $x(t)$ and the shifted version of $y(t)$. In the above example the oscillation with frequency $\omega$ in $y(t)$ matches that in $x(t)$, but the harmonic at $n\omega$ does not. So the cross-correlation reveals this match and also the phase shift (delay) between these components.
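Equation (8.74) can be checked with a finite time average. The sketch below is plain Python with assumed parameter values (A = B = 1, C = D = 2, a 1 Hz fundamental, n = 2, a 200 Hz sampling rate). Averaging over a whole number of periods is a deliberate choice here, so that the unmatched harmonic and the mean value B average out exactly rather than only approximately.

```python
import math

fs = 200                      # sampling rate in Hz (assumed)
n_avg = 10 * fs               # average over 10 whole periods of the 1 Hz component
max_lag = fs                  # examine lags up to one period
A, B, C, D = 1.0, 1.0, 2.0, 2.0
theta_x, theta_y, phi, n_harm = 0.0, -math.pi / 4, math.pi / 2, 2
w = 2 * math.pi * 1.0         # fundamental frequency of 1 Hz

t = [i / fs for i in range(n_avg + max_lag)]
x = [A * math.sin(w * ti + theta_x) + B for ti in t]
y = [C * math.sin(w * ti + theta_y) + D * math.sin(n_harm * w * ti + phi) for ti in t]

def r_xy_estimate(m):
    """Time average of x(t) y(t + tau) at lag tau = m / fs."""
    return sum(x[i] * y[i + m] for i in range(n_avg)) / n_avg

def r_xy_theory(m):
    """Right-hand side of Equation (8.74) at lag tau = m / fs."""
    return 0.5 * A * C * math.cos(w * m / fs - (theta_x - theta_y))

r_est = [r_xy_estimate(m) for m in range(max_lag + 1)]
peak_lag = max(range(max_lag + 1), key=lambda m: r_est[m])

print("lag of peak (s):", peak_lag / fs)
print("(theta_x - theta_y) / omega (s):", (theta_x - theta_y) / w)
```

The peak of the estimate sits at the lag $(\theta_x - \theta_y)/\omega$, and neither B nor the component at $n\omega$ contributes, as the discussion above suggests.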
A Signal Buried in Noise M8.5

Consider a signal buried in noise, i.e. $y(t) = s(t) + n(t)$, as shown in Figure 8.22.

Figure 8.22 A sinusoidal signal buried in noise

We assume that the noise and signal are uncorrelated: for example, $s(t)$ is a sine wave and $n(t)$ is wideband noise. Then, the cross-correlation function of the signal $s(t)$ and noise $n(t)$ is $R_{sn}(\tau) = E[s(t)n(t+\tau)] = \mu_s\mu_n$, i.e. $C_{sn}(\tau) = E[(s(t)-\mu_s)(n(t+\tau)-\mu_n)] = 0$. Note that the cross-covariance function of two uncorrelated signals is zero for all $\tau$. Thus, the autocorrelation function of $y(t)$ becomes

$$\begin{aligned}
R_{yy}(\tau) &= E[(s(t)+n(t))(s(t+\tau)+n(t+\tau))] \\
&= E[s(t)s(t+\tau)] + E[n(t)n(t+\tau)] + 2\mu_s\mu_n
\end{aligned} \tag{8.76}$$

Assuming that the mean values are zero, this is

$$R_{yy}(\tau) = R_{ss}(\tau) + R_{nn}(\tau) \tag{8.77}$$

Since the autocorrelation function of the noise $R_{nn}(\tau)$ decays very rapidly (see Equation (8.62)), the autocorrelation function of the signal $R_{ss}(\tau)$ will dominate for larger values of $\tau$, as shown in Figure 8.23. This demonstrates a method of identifying sinusoidal components embedded in noise.

Figure 8.23 Autocorrelation function of a sinusoidal signal buried in noise
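Equation (8.77) can be illustrated with simulated data. The sketch below is plain Python and assumes a unit-amplitude 1 Hz sine wave, unit-variance Gaussian white noise, a 50 Hz sampling rate and an 80 second record; none of these values come from the text. It uses an unbiased time-average estimate of $R_{yy}(\tau)$ and inspects it at zero lag, half a period and one period.

```python
import math
import random

def autocorr_unbiased(y, m):
    """Unbiased time-average estimate of R_yy at lag m samples (m >= 0)."""
    n = len(y) - m
    return sum(y[i] * y[i + m] for i in range(n)) / n

random.seed(0)
fs, T, A = 50, 80, 1.0                     # assumed sampling rate, record time, amplitude
N = fs * T
s = [A * math.sin(2 * math.pi * i / fs) for i in range(N)]   # 1 Hz sine wave
noise = [random.gauss(0.0, 1.0) for _ in range(N)]           # white noise, variance 1
y = [si + ni for si, ni in zip(s, noise)]

r_zero = autocorr_unbiased(y, 0)             # R_ss(0) + R_nn(0), about A^2/2 + 1
r_half_period = autocorr_unbiased(y, fs // 2)  # about -A^2/2, the noise term has gone
r_one_period = autocorr_unbiased(y, fs)        # about +A^2/2

print(r_zero, r_half_period, r_one_period)
```

Away from zero lag the estimate follows $(A^2/2)\cos(\omega\tau)$, the autocorrelation of the sine wave alone, even though the noise power is twice the signal power.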


Time Delay Problem M8.6

Consider a wheeled vehicle moving over rough terrain as shown in Figure 8.24. Let the time function (profile) experienced by the leading wheel be $x(t)$ and that by the trailing wheel be $y(t)$. Also let the autocorrelation function of $x(t)$ be $R_{xx}(\tau)$. We now investigate the properties of the cross-correlation function $R_{xy}(\tau)$.

Assume that the vehicle moves at a constant speed $V$. Then, $y(t) = x(t - \Delta)$ where $\Delta = L/V$. So the cross-correlation function is

$$R_{xy}(\tau) = E[x(t)y(t+\tau)] = E[x(t)x(t+\tau-\Delta)] = R_{xx}(\tau - \Delta) \tag{8.78}$$

Figure 8.24 A wheeled vehicle moving over rough terrain

Figure 8.25 Autocorrelation and cross-correlation functions for time delay problem

That is, the cross-correlation function $R_{xy}(\tau)$ becomes a delayed version of $R_{xx}(\tau)$ as shown in Figure 8.25. The cross-correlation function detects the time delay between the two signals.

The detection of time delay using the cross-correlation function has been applied to many problems, e.g. radar systems, acoustic source localization, mechanical fault detection, pipe leakage detection, earthquake location, etc. The basic concept of using the cross-correlation function for a simplified radar system is demonstrated in MATLAB Example 8.7.
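The delay-detection idea of Equation (8.78) can be sketched in a few lines. The example below is plain Python and rests on assumptions that are not in the text: the 'terrain profile' x is modelled as white Gaussian noise, the delay is 25 samples, and the trailing-wheel signal also carries independent measurement noise. The estimated delay is simply the lag at which the estimated cross-correlation is largest.

```python
import random

def cross_corr(x, y, m):
    """Time-average estimate of R_xy at lag m samples (m >= 0)."""
    n = len(x) - m
    return sum(x[i] * y[i + m] for i in range(n)) / n

random.seed(1)
N, delay = 2000, 25                       # record length and true delay in samples (assumed)
x = [random.gauss(0.0, 1.0) for _ in range(N)]
# y(t) = x(t - delay) plus independent measurement noise (assumed level 0.5)
y = [(x[i - delay] if i >= delay else 0.0) + 0.5 * random.gauss(0.0, 1.0)
     for i in range(N)]

lags = range(0, 61)
r_xy = [cross_corr(x, y, m) for m in lags]
estimated_delay = max(lags, key=lambda m: r_xy[m])

print("estimated delay (samples):", estimated_delay)
```

Because $R_{xx}(\tau)$ of white noise is concentrated at zero lag, $R_{xy}(\tau)$ shows a single sharp peak at the delay. For a smoother profile the peak would be broader but would sit at the same lag.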


8.7 SPECTRA 

So far we have discussed stochastic processes in the time domain. We now consider frequency domain descriptors. In Part I, Fourier methods were applied to deterministic phenomena, e.g. periodic and transient signals. We shall now consider Fourier methods for stationary random processes.

Consider a truncated sample function $x_T(t)$ of a random process $x(t)$ as shown in Figure 8.26, i.e.

$$x_T(t) = \begin{cases} x(t) & |t| < T/2 \\ 0 & \text{otherwise} \end{cases} \tag{8.79}$$

Figure 8.26 A truncated sample function of a stochastic process
We shall consider the decomposition of the power of this sample function in the frequency domain. As seen from the figure, the truncated signal $x_T(t)$ is pulse-like, i.e. it can be regarded as a transient signal. Thus, it can be represented by the Fourier integral

$$x_T(t) = \int_{-\infty}^{\infty} X_T(f)e^{j2\pi ft}\,df \tag{8.80}$$

Since the total energy of the signal $\int_{-\infty}^{\infty} x_T^2(t)\,dt$ tends to infinity as $T$ gets large, we shall consider the average power of the signal, i.e.

$$\frac{1}{T}\int_{-\infty}^{\infty} x_T^2(t)\,dt$$

Then, by Parseval's theorem it can be shown that

$$\frac{1}{T}\int_{-\infty}^{\infty} x_T^2(t)\,dt = \frac{1}{T}\int_{-T/2}^{T/2} x_T^2(t)\,dt = \frac{1}{T}\int_{-\infty}^{\infty} |X_T(f)|^2\,df = \int_{-\infty}^{\infty} \frac{1}{T}|X_T(f)|^2\,df \tag{8.81}$$

where the quantity $|X_T(f)|^2/T$ is called the raw (or sample) power spectral density, which is denoted as

$$\hat{S}_{xx}(f) = \frac{1}{T}|X_T(f)|^2 \tag{8.82}$$

Note that the power of the signal in a data segment of length $T$ is

$$\frac{1}{T}\int_{-\infty}^{\infty} x_T^2(t)\,dt = \int_{-\infty}^{\infty} \hat{S}_{xx}(f)\,df \tag{8.83}$$

Now, as $T \to \infty$ Equation (8.81) can be written as

$$\lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} x_T^2(t)\,dt = \int_{-\infty}^{\infty} \lim_{T\to\infty}\frac{|X_T(f)|^2}{T}\,df \tag{8.84}$$

Note that the left hand side of the equation is the average power of the sample function, thus it may be tempting to define $\lim_{T\to\infty}|X_T(f)|^2/T$ as the power spectral density. However, we shall see (later) that $\hat{S}_{xx}(f)$ does not converge (in a statistical sense) as $T \to \infty$, which is the reason that the term 'raw' is used. In Chapter 10, we shall see that $\hat{S}_{xx}(f)$ evaluated from a larger data length is just as erratic as for the shorter data length, i.e. the estimate $\hat{S}_{xx}(f)$ cannot be improved simply by using more data (even for $T \to \infty$). We shall also see later (in Chapter 10) that the standard deviation of the estimate is as great as the quantity being estimated! That is, it is independent of data length and equal to the true spectral density as follows:

$$\mathrm{Var}\left(\hat{S}_{xx}(f)\right) = S_{xx}^2(f) \qquad \left(\text{or } \frac{\mathrm{Var}\left(\hat{S}_{xx}(f)\right)}{S_{xx}^2(f)} = 1\right) \tag{8.85}$$
In fact, we have now come across an estimate for which ergodicity does not hold, i.e. $\hat{S}_{xx}(f)$ is not ergodic. So some method of reducing the variability is required.
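The erratic behaviour of the raw spectral density can be seen in a small simulation. The sketch below is plain Python and makes several assumptions: the process is unit-variance Gaussian white noise (so the true spectrum is flat and equal to 1 for unit sample spacing), the discrete form of Equation (8.82) is $|X_k|^2/N$, and the scatter across frequency bins is used as a stand-in for the statistical variability at a single frequency, which is reasonable only because the spectrum is flat.

```python
import cmath
import math
import random

def raw_psd(x, k):
    """Discrete form of the raw spectral density, |X_k|^2 / N, unit sample spacing."""
    n = len(x)
    X = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
    return abs(X) ** 2 / n

random.seed(2)
results = {}
for n in (256, 1024):                       # a short and a long record
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    psd = [raw_psd(x, k) for k in range(1, n // 2)]   # bins between 0 and Nyquist
    mean = sum(psd) / len(psd)
    spread = math.sqrt(sum((v - mean) ** 2 for v in psd) / len(psd))
    results[n] = (mean, spread / mean)
    print(f"N = {n:5d}  mean = {mean:.2f}  std/mean = {spread / mean:.2f}")
```

The mean is close to the true value of 1 for both record lengths, but the ratio of standard deviation to mean stays near 1 and does not shrink when the record is four times longer, in line with Equation (8.85).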

We do this by averaging the raw spectral density to remove the erratic behaviour. Consider the following average

$$\lim_{T\to\infty}\frac{1}{T}\,E\left[\int_{-T/2}^{T/2} x_T^2(t)\,dt\right] = E\left[\int_{-\infty}^{\infty}\lim_{T\to\infty}\frac{|X_T(f)|^2}{T}\,df\right] \tag{8.86}$$

Assuming zero mean values, the left hand side of Equation (8.86) is the variance of the process, thus it can be written as

$$\mathrm{Var}(x(t)) = \sigma_x^2 = \int_{-\infty}^{\infty} S_{xx}(f)\,df \tag{8.87}$$

where

$$S_{xx}(f) = \lim_{T\to\infty}\frac{E\left[|X_T(f)|^2\right]}{T} \tag{8.88}$$


This function is called the power spectral density function of the process, and it states that the average power of the process (the variance) is decomposed in the frequency domain through the function $S_{xx}(f)$, which has a clear physical interpretation. Furthermore, there is a direct relationship with the autocorrelation function such that

$$S_{xx}(f) = \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau \tag{8.89}$$

$$R_{xx}(\tau) = \int_{-\infty}^{\infty} S_{xx}(f)e^{j2\pi f\tau}\,df \tag{8.90}$$

These relations are sometimes called the Wiener-Khinchin theorem.

Note that, if $\omega$ is used, the equivalent result is

$$S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j\omega\tau}\,d\tau \tag{8.91}$$

$$R_{xx}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)e^{j\omega\tau}\,d\omega \tag{8.92}$$

Similar to the Fourier transform pair, the location of the factor $2\pi$ may be interchanged or replaced with $1/\sqrt{2\pi}$ for symmetrical form. The proof of the above Fourier pair (Equations (8.89)-(8.92)) needs some elements discussed in Chapter 10, so this will be justified later.

Note that the function $S_{xx}(f)$ is an even function of frequency and is sometimes called the two-sided power spectral density. If $x(t)$ is in volts, $S_{xx}(f)$ has units of volts$^2$/Hz. Often a one-sided power spectral density is defined as

$$G_{xx}(f) = \begin{cases} 2S_{xx}(f) & f > 0 \\ S_{xx}(f) & f = 0 \\ 0 & f < 0 \end{cases} \tag{8.93}$$

Examples of Power Spectral Density Functions (see the examples in Section 4.3 and compare)

(a) If $R_{xx}(\tau) = k\delta(\tau)$, $k > 0$, i.e. white noise, then

$$S_{xx}(f) = \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau = \int_{-\infty}^{\infty} k\delta(\tau)e^{-j2\pi f\tau}\,d\tau = ke^{-j2\pi f\cdot 0} = k \tag{8.94}$$

Figure 8.27 Power spectral density of white noise

Note that a 'narrow' autocorrelation function results in a broadband spectrum (Figure 8.27).

(b) If $R_{xx}(\tau) = \sigma_x^2 e^{-\lambda|\tau|}$, $\lambda > 0$, then

$$S_{xx}(f) = \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau = \int_{-\infty}^{\infty} \sigma_x^2 e^{-\lambda|\tau|}e^{-j2\pi f\tau}\,d\tau = \frac{2\lambda\sigma_x^2}{\lambda^2 + (2\pi f)^2} \tag{8.95}$$

Figure 8.28 Exponentially decaying autocorrelation and corresponding power spectral density

The exponentially decaying autocorrelation function results in a mainly low-frequency power spectral density function (Figure 8.28).
(c) If $R_{xx}(\tau) = (A^2/2)\cos(2\pi f_0\tau)$, then

$$\begin{aligned}
S_{xx}(f) &= \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau = \frac{A^2}{4}\int_{-\infty}^{\infty}\left(e^{j2\pi f_0\tau} + e^{-j2\pi f_0\tau}\right)e^{-j2\pi f\tau}\,d\tau \\
&= \frac{A^2}{4}\int_{-\infty}^{\infty}\left(e^{-j2\pi(f-f_0)\tau} + e^{-j2\pi(f+f_0)\tau}\right)d\tau = \frac{A^2}{4}\delta(f-f_0) + \frac{A^2}{4}\delta(f+f_0)
\end{aligned} \tag{8.96}$$

Figure 8.29 Sinusoidal autocorrelation and corresponding power spectral density

An oscillatory autocorrelation function corresponds to spikes in the power spectral density function (Figure 8.29).


(d) Band-limited white noise: If the power spectral density function is

$$S_{xx}(f) = \begin{cases} a & -B < f < B \\ 0 & \text{otherwise} \end{cases} \tag{8.97}$$

then the corresponding autocorrelation function (shown in Figure 8.30) is

$$R_{xx}(\tau) = \int_{-\infty}^{\infty} S_{xx}(f)e^{j2\pi f\tau}\,df = \int_{-B}^{B} ae^{j2\pi f\tau}\,df = 2aB\,\frac{\sin(2\pi B\tau)}{2\pi B\tau} \tag{8.98}$$

Figure 8.30 Autocorrelation and power spectral density of band-limited white noise
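The closed-form results in examples (a)-(d) can be checked by evaluating the Wiener-Khinchin integral numerically. As an illustration, the sketch below (plain Python) does this for example (b), Equation (8.95), with assumed values $\sigma_x^2 = 4$ and $\lambda = 3$. Because $R_{xx}(\tau)$ is even, the transform reduces to twice a cosine integral over positive lags, which is approximated here by the trapezoidal rule on a truncated range; the truncation point and step count are arbitrary choices.

```python
import math

def psd_from_autocorr(r_func, f, tau_max, n_steps):
    """Trapezoidal estimate of 2 * integral_0^tau_max R(tau) cos(2 pi f tau) dtau."""
    h = tau_max / n_steps
    total = 0.5 * (r_func(0.0) + r_func(tau_max) * math.cos(2 * math.pi * f * tau_max))
    for i in range(1, n_steps):
        tau = i * h
        total += r_func(tau) * math.cos(2 * math.pi * f * tau)
    return 2.0 * h * total

sigma2, lam = 4.0, 3.0                      # assumed variance and decay rate

def r_exp(tau):
    """Autocorrelation of example (b): sigma^2 * exp(-lambda * |tau|)."""
    return sigma2 * math.exp(-lam * abs(tau))

checks = []
for f in (0.0, 0.5, 2.0):
    numeric = psd_from_autocorr(r_exp, f, tau_max=20.0, n_steps=200000)
    exact = 2 * lam * sigma2 / (lam ** 2 + (2 * math.pi * f) ** 2)
    checks.append((f, numeric, exact))
    print(f"f = {f:.1f}  numerical = {numeric:.6f}  Equation (8.95) = {exact:.6f}")
```

The numerical values reproduce the low-pass shape of Figure 8.28: the spectrum is largest at $f = 0$ and falls off as $f$ increases.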
The Cross-spectral Density Function

Generalizing the Wiener-Khinchin theorem, the cross-spectral density function is 'defined' as

$$S_{xy}(f) = \int_{-\infty}^{\infty} R_{xy}(\tau)e^{-j2\pi f\tau}\,d\tau \tag{8.99}$$

with inverse

$$R_{xy}(\tau) = \int_{-\infty}^{\infty} S_{xy}(f)e^{j2\pi f\tau}\,df \tag{8.100}$$

As with the power spectral density function, if $\omega$ is used in place of $f$, then

$$S_{xy}(\omega) = \int_{-\infty}^{\infty} R_{xy}(\tau)e^{-j\omega\tau}\,d\tau \tag{8.101}$$

$$R_{xy}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xy}(\omega)e^{j\omega\tau}\,d\omega \tag{8.102}$$

Alternatively, $S_{xy}(f)$ is defined as

$$S_{xy}(f) = \lim_{T\to\infty}\frac{E\left[X_T^*(f)Y_T(f)\right]}{T} \tag{8.103}$$

where $X_T(f)$ and $Y_T(f)$ are Fourier transforms of the truncated functions $x_T(t)$ and $y_T(t)$ defined for $|t| < T/2$ (see Figure 8.26).

The equivalence of Equations (8.99) and (8.103) may be justified in the same manner as for the power spectral density functions as discussed in Chapter 10.

In general, the cross-spectral density function is complex valued, i.e.

$$S_{xy}(f) = |S_{xy}(f)|\,e^{j\arg S_{xy}(f)} \tag{8.104}$$

This can be interpreted as the frequency domain equivalent of the cross-correlation function. That is, $|S_{xy}(f)|$ is the cross-amplitude spectrum and it shows whether frequency components in one signal are 'associated' with large or small amplitude at the same frequency in the other signal, i.e. it is the measure of association of amplitude in $x$ and $y$ at frequency $f$; $\arg S_{xy}(f)$ is the phase spectrum and this shows whether frequency components in one signal 'lag' or 'lead' the components at the same frequency in the other signal, i.e. it shows lags/leads (or phase difference) between $x$ and $y$ at frequency $f$.


Properties of the Cross-spectral Density Function

(a) An important property is

$$S_{xy}(f) = S_{yx}^*(f) \tag{8.105}$$

This can be easily proved using the fact that $R_{xy}(\tau) = R_{yx}(-\tau)$.

(b) A one-sided cross-spectral density function $G_{xy}(f)$ is defined as

$$G_{xy}(f) = \begin{cases} 2S_{xy}(f) & f > 0 \\ S_{xy}(f) & f = 0 \\ 0 & f < 0 \end{cases} \tag{8.106}$$

(c) The coincident spectral density (co-spectrum) and quadrature spectral density (quad-spectrum) are defined as (Bendat and Piersol, 2000)

$$G_{xy}(f) = 2\int_{-\infty}^{\infty} R_{xy}(\tau)e^{-j2\pi f\tau}\,d\tau = C_{xy}(f) - jQ_{xy}(f) \qquad f \geq 0 \tag{8.107}$$

where $C_{xy}(f)$ and $Q_{xy}(f)$ are called the co-spectra and quad-spectra, respectively (the reason for these names is explained later, in example (a) of the examples of cross-spectral density functions). $C_{xy}(f)$ is an even function of $f$ and $Q_{xy}(f)$ is an odd function of $f$. Also, by writing $R_{xy}(\tau)$ as the sum of even and odd parts (see Equation (3.18)), then

$$C_{xy}(f) = 2\int_0^{\infty}\left[R_{xy}(\tau) + R_{yx}(\tau)\right]\cos(2\pi f\tau)\,d\tau = C_{xy}(-f) \tag{8.108}$$

$$Q_{xy}(f) = 2\int_0^{\infty}\left[R_{xy}(\tau) - R_{yx}(\tau)\right]\sin(2\pi f\tau)\,d\tau = -Q_{xy}(-f) \tag{8.109}$$

Similar to $S_{xy}(f)$, the one-sided cross-spectral density function $G_{xy}(f)$ can be written as

$$G_{xy}(f) = |G_{xy}(f)|\,e^{j\arg G_{xy}(f)} \tag{8.110}$$

Then, it can be shown that

$$|G_{xy}(f)| = \sqrt{C_{xy}^2(f) + Q_{xy}^2(f)} \tag{8.111}$$

and

$$\arg G_{xy}(f) = -\tan^{-1}\left(\frac{Q_{xy}(f)}{C_{xy}(f)}\right) \tag{8.112}$$


(d) For the phase of $S_{xy}(f) = |S_{xy}(f)|\,e^{j\arg S_{xy}(f)}$, let $\theta_x(f)$ and $\theta_y(f)$ be the phase components at frequency $f$ corresponding to $x(t)$ and $y(t)$, respectively. Then, $\arg S_{xy}(f)$ gives the phase difference such that

$$\arg S_{xy}(f) = -\left[\theta_x(f) - \theta_y(f)\right] \tag{8.113}$$

which is $\theta_y(f) - \theta_x(f)$. Note that it is not $\theta_x(f) - \theta_y(f)$ (see Equation (8.103) where $X_T^*(f)$ is multiplied by $Y_T(f)$, and also compare it with Equations (8.73) and (8.74)). Thus, in some texts, $S_{xy}(f)$ is defined as

$$S_{xy}(f) = |S_{xy}(f)|\,e^{-j\theta_{xy}(f)} \tag{8.114}$$

where $\theta_{xy}(f) = \theta_x(f) - \theta_y(f)$. So, care must be taken with the definitions.


(e) A useful inequality satisfied by $S_{xy}(f)$ is

$$|S_{xy}(f)|^2 \leq S_{xx}(f)S_{yy}(f) \tag{8.115}$$

or

$$|G_{xy}(f)|^2 \leq G_{xx}(f)G_{yy}(f) \tag{8.116}$$

The proof of this result is given in Appendix B. We shall see later that this is a particularly useful result: in Chapter 9 we shall define the 'coherence function' as $|S_{xy}(f)|^2/(S_{xx}(f)S_{yy}(f))$, which is a normalized cross-spectral density function.


Examples of Cross-spectral Density Functions

Two examples are as follows:

(a) Consider two functions (see also Equation (8.73) in Section 8.6) M8.8

$$x(t) = A\sin(2\pi pt + \theta_x)$$
$$y(t) = C\sin(2\pi pt + \theta_y) + D\sin(n2\pi pt + \phi) \tag{8.117}$$

In the previous section, it was shown that the cross-correlation function is

$$R_{xy}(\tau) = \frac{1}{2}AC\cos\left[2\pi p\tau - (\theta_x - \theta_y)\right] \tag{8.118}$$

Let $\theta_{xy} = \theta_x - \theta_y$; then

$$R_{xy}(\tau) = \frac{1}{2}AC\cos(2\pi p\tau - \theta_{xy}) \tag{8.119}$$

The cross-spectral density function is

$$\begin{aligned}
S_{xy}(f) &= \int_{-\infty}^{\infty} R_{xy}(\tau)e^{-j2\pi f\tau}\,d\tau = \frac{AC}{4}\int_{-\infty}^{\infty}\left(e^{j(2\pi p\tau - \theta_{xy})} + e^{-j(2\pi p\tau - \theta_{xy})}\right)e^{-j2\pi f\tau}\,d\tau \\
&= \frac{AC}{4}\int_{-\infty}^{\infty}\left(e^{-j2\pi(f-p)\tau}e^{-j\theta_{xy}} + e^{-j2\pi(f+p)\tau}e^{j\theta_{xy}}\right)d\tau \\
&= \frac{AC}{4}\left[\delta(f-p)e^{-j\theta_{xy}} + \delta(f+p)e^{j\theta_{xy}}\right]
\end{aligned} \tag{8.120}$$

and the one-sided cross-spectral density function is

$$G_{xy}(f) = \frac{AC}{2}\delta(f-p)e^{-j\theta_{xy}} \tag{8.121}$$

Thus, it can be seen that the amplitude association is

$$|G_{xy}(f)| = \frac{AC}{2}\delta(f-p) \tag{8.122}$$

and the phase difference is

$$\arg G_{xy}(f) = -\theta_{xy} = -(\theta_x - \theta_y) \tag{8.123}$$

From $G_{xy}(f)$ in Equation (8.121), we see that the co-spectra and quad-spectra are

$$C_{xy}(f) = \frac{AC}{2}\delta(f-p)\cos\theta_{xy} \tag{8.124}$$

$$Q_{xy}(f) = \frac{AC}{2}\delta(f-p)\sin\theta_{xy} \tag{8.125}$$

Since $x(t) = A\sin(2\pi pt + \theta_x) = A\sin(2\pi pt + \theta_y + \theta_{xy})$, Equation (8.117) can be written as

$$x(t) = A\sin(2\pi pt + \theta_y)\cos\theta_{xy} + A\cos(2\pi pt + \theta_y)\sin\theta_{xy}$$
$$y(t) = C\sin(2\pi pt + \theta_y) + D\sin(n2\pi pt + \phi) \tag{8.126}$$

Comparing Equations (8.124) and (8.126), we see that $C_{xy}(f)$ measures the correlation of the in-phase components, i.e. between $A\sin(2\pi pt + \theta_y)$ and $C\sin(2\pi pt + \theta_y)$, thus it is called the coincident spectrum. Similarly, $Q_{xy}(f)$ measures the correlation between sine and cosine components ($A\cos(2\pi pt + \theta_y)$ and $C\sin(2\pi pt + \theta_y)$), i.e. quadrature components, thus it is called the quadrature spectrum.

(b) Consider the wheeled vehicle example shown previously in Figure 8.24, shown again here M8.9

Figure 8.24 A wheeled vehicle moving over rough terrain

We have seen that $R_{xy}(\tau) = R_{xx}(\tau - \Delta)$, where $\Delta = L/V$. So the cross-spectral density function is

$$\begin{aligned}
S_{xy}(f) &= \int_{-\infty}^{\infty} R_{xx}(\tau - \Delta)e^{-j2\pi f\tau}\,d\tau = \int_{-\infty}^{\infty} R_{xx}(u)e^{-j2\pi f(u+\Delta)}\,du \\
&= e^{-j2\pi f\Delta}\int_{-\infty}^{\infty} R_{xx}(u)e^{-j2\pi fu}\,du \\
&= e^{-j2\pi f\Delta}S_{xx}(f)
\end{aligned} \tag{8.127}$$

This shows that the frequency component $f$ in the signal $y(t)$ lags that component in $x(t)$ by phase angle $2\pi f\Delta$. This is obvious from simple considerations: for example, if $x(t) = A\cos(\omega t)$ then $y(t) = A\cos[\omega(t - \Delta)] = A\cos(\omega t - \omega\Delta)$, i.e. the lag angle is $\omega\Delta = 2\pi f\Delta$.
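The phase relation in Equation (8.127) can be reproduced with a raw cross-spectrum $X^*Y/N$ computed from a single record. The sketch below is plain Python and makes simplifying assumptions: the signal is a 64-point random sequence, the delay of 3 samples is applied circularly (so the discrete relation holds exactly rather than approximately), and frequency is measured in cycles per sample as $f = k/N$.

```python
import cmath
import math
import random

def dft_bin(x, k):
    """Discrete Fourier transform of x evaluated at bin k."""
    n = len(x)
    return sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))

random.seed(3)
N, delay = 64, 3                            # record length and delay in samples (assumed)
x = [random.gauss(0.0, 1.0) for _ in range(N)]
y = [x[(i - delay) % N] for i in range(N)]  # y is x delayed circularly by `delay`

k = 5                                       # frequency bin, f = k / N cycles per sample
X, Y = dft_bin(x, k), dft_bin(y, k)
raw_cross = X.conjugate() * Y / N           # raw cross-spectrum, X* times Y as in (8.103)
phase = cmath.phase(raw_cross)
expected = -2 * math.pi * (k / N) * delay   # -2 pi f Delta from Equation (8.127)
gain = abs(Y / X)                           # magnitude of the ratio Y / X

print(phase, expected, gain)
```

The phase of the raw cross-spectrum equals $-2\pi f\Delta$ and the magnitude of the ratio $Y/X$ is 1, which is what a pure delay should give.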


Comments on the Time Delay Problem

At this point, it may be worth relating the time delay problem to the pure delay discussed in Chapter 4, where we defined the group delay as $t_g = -d\phi(\omega)/d\omega$. We saw that a pure delay (say, delay time is $\Delta$ for all frequencies) produces a constant group delay, i.e. $t_g = \Delta$.

The above time delay problem can be considered as identifying a pure delay system. To see this more clearly, rewrite Equation (8.127) as

$$S_{xy}(f) = H(f)S_{xx}(f) \tag{8.128}$$

where $H(f)$ is the frequency response function which can be written as

$$H(f) = \frac{S_{xy}(f)}{S_{xx}(f)} = e^{-j2\pi f\Delta} \tag{8.129}$$

i.e. $H(f)$ is a pure delay system. Note that we are identifying the system by forming the ratio $S_{xy}(f)/S_{xx}(f)$. We shall compare the results of using $\arg S_{xy}(f)$ and $\arg H(f)$ in MATLAB Example 8.9. We shall also demonstrate a simple system identification problem by forming $H(f) = S_{xy}(f)/S_{xx}(f)$ in MATLAB Example 8.10. More details of system identification will be discussed in the next chapter.


8.8 BRIEF SUMMARY 

1. A stochastic process $X(t)$ is stationary if it satisfies two conditions: (i) $p(x, t) = p(x)$ and (ii) $p(x_1, t_1; x_2, t_2) = p(x_1, t_1 + T; x_2, t_2 + T)$.

2. If certain averages of the process are ergodic, then ensemble averages can be replaced by time averages.

3. Autocovariance and autocorrelation functions are defined by

$$C_{xx}(\tau) = E[(X(t) - \mu_x)(X(t+\tau) - \mu_x)]$$
$$R_{xx}(\tau) = E[X(t)X(t+\tau)]$$

where $C_{xx}(\tau) = R_{xx}(\tau) - \mu_x^2$. The corresponding time averages are

$$C_{xx}(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_0^T (x(t) - \mu_x)(x(t+\tau) - \mu_x)\,dt$$
$$R_{xx}(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_0^T x(t)x(t+\tau)\,dt$$

4. Cross-covariance and cross-correlation functions are defined by

$$C_{xy}(\tau) = E[(X(t) - \mu_x)(Y(t+\tau) - \mu_y)]$$
$$R_{xy}(\tau) = E[X(t)Y(t+\tau)]$$

where $C_{xy}(\tau) = R_{xy}(\tau) - \mu_x\mu_y$. The corresponding time averages are

$$C_{xy}(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_0^T (x(t) - \mu_x)(y(t+\tau) - \mu_y)\,dt$$
$$R_{xy}(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_0^T x(t)y(t+\tau)\,dt$$

5. An unbiased estimate of the cross-covariance function is

$$\hat{C}_{xy}(\tau) = \frac{1}{T-\tau}\int_0^{T-\tau} (x(t) - \bar{x})(y(t+\tau) - \bar{y})\,dt \qquad 0 \leq \tau < T$$

where $\hat{C}_{xy}(-\tau) = \hat{C}_{yx}(\tau)$. The corresponding digital form is

$$\hat{C}_{xy}(m\Delta) = \frac{1}{N-m}\sum_{n=0}^{N-m-1} (x(n\Delta) - \bar{x})(y((n+m)\Delta) - \bar{y}) \qquad 0 \leq m \leq N-1$$

6. The autocorrelation functions of a periodic signal and a transient signal are, respectively,

$$R_{xx}(\tau) = \frac{1}{T_P}\int_0^{T_P} x(t)x(t+\tau)\,dt \quad \text{and} \quad R_{xx}(\tau) = \int_{-\infty}^{\infty} x(t)x(t+\tau)\,dt$$
7. The autocorrelation function of white noise is

$$R_{xx}(\tau) = k\delta(\tau)$$

8. The cross-correlation function of two uncorrelated signals $s(t)$ and $n(t)$ is

$$R_{sn}(\tau) = E[s(t)n(t+\tau)] = 0 \quad \text{(assuming zero mean values)}$$

9. The power spectral density and cross-spectral density functions are

$$S_{xx}(f) = \lim_{T\to\infty}\frac{E\left[|X_T(f)|^2\right]}{T} \quad \text{and} \quad S_{xy}(f) = \lim_{T\to\infty}\frac{E\left[X_T^*(f)Y_T(f)\right]}{T}$$

and the corresponding raw (or sample) spectral density functions are

$$\hat{S}_{xx}(f) = \frac{1}{T}|X_T(f)|^2 \quad \text{and} \quad \hat{S}_{xy}(f) = \frac{1}{T}X_T^*(f)Y_T(f)$$

10. The Wiener-Khinchin theorem is

$$S_{xx}(f) = \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau \quad \text{and} \quad R_{xx}(\tau) = \int_{-\infty}^{\infty} S_{xx}(f)e^{j2\pi f\tau}\,df$$

Also,

$$S_{xy}(f) = \int_{-\infty}^{\infty} R_{xy}(\tau)e^{-j2\pi f\tau}\,d\tau \quad \text{and} \quad R_{xy}(\tau) = \int_{-\infty}^{\infty} S_{xy}(f)e^{j2\pi f\tau}\,df$$

11. The cross-spectral density function is complex valued, i.e.

$$S_{xy}(f) = |S_{xy}(f)|\,e^{j\arg S_{xy}(f)} \quad \text{and} \quad S_{xy}(f) = S_{yx}^*(f)$$

where $|S_{xy}(f)|$ is the measure of association of amplitude in $x$ and $y$ at frequency $f$, and $\arg S_{xy}(f)$ shows lags/leads (or phase difference) between $x$ and $y$ at frequency $f$.
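The digital form of the unbiased cross-covariance estimate in item 5 translates directly into code. The following is a minimal sketch in plain Python; the function name and the two short test sequences are illustrative only. The divisor is $N - m$, the number of products actually available at lag $m$, rather than $N$.

```python
def cross_covariance_unbiased(x, y, m):
    """Unbiased estimate of C_xy at lag m samples, 0 <= m <= N - 1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    total = sum((x[i] - x_mean) * (y[i + m] - y_mean) for i in range(n - m))
    return total / (n - m)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
estimates = [cross_covariance_unbiased(x, y, m) for m in range(len(x))]
print(estimates)
```

At the largest lags only one or two products are averaged, so the estimate there is unbiased but highly variable. This is one reason the MATLAB examples below restrict the maximum lag to a fraction of the record length.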


8.9 MATLAB EXAMPLES 


Example 8.1: Probability density function of a sine wave 


The theoretical probability density function of a sine wave $x(t) = A\sin(\omega t + \theta)$ is

$$p(x) = \frac{1}{\pi\sqrt{A^2 - x^2}}$$

In this MATLAB example, we compare the histograms resulting from the ensemble average and the time average. For the ensemble average $\theta$ is a random variable and $t$ is fixed, and for the time average $\theta$ is fixed and $t$ is a time variable.
Line  MATLAB code

1     clear all
2     A=2; w=1; t=0;
3     rand('state',1);
4     theta=rand(1,20000)*2*pi;
5     x1=A*sin(w*t+theta);
6     nbin=20; N1=length(x1);
7     [n1 s1]=hist(x1,nbin);
8     figure(1) % Ensemble average
9     bar(s1, n1/N1)
10    xlabel('\itx\rm_1')
11    ylabel('Relative frequency')
12    t=0:0.01:(2*pi)/w-0.01;
13    x2=A*sin(w*t);
14    [n2 s2]=hist(x2, nbin);
15    N2=length(x2);
16    figure(2) % Time average
17    bar(s2, n2/N2)
18    xlabel('\itx\rm_2')
19    ylabel('Relative frequency')

Comments

Lines 1-2: Define the amplitude and frequency of a sine wave. For the ensemble average, let time t = 0.

Lines 3-5: Initialize the random number generator, and then generate θ which is uniformly distributed on the range 0 to 2π. The number of elements of θ is 20000. Also generate a sequence x1 which can be considered as an ensemble (only for the specified time, t = 0).

Lines 6-7: Define the number of bins for the histogram. Then calculate the frequency counts and bin locations.

Lines 8-11: Plot the histogram of x1. Note that it has a U shape as expected. One may change the number of elements of θ, and compare the results.

Lines 12-13: For the time average, θ is set to zero and a sine wave (x2) is generated for one period.

Lines 14-15: Calculate the frequency counts and bin locations for x2.

Lines 16-19: Plot the histogram of x2. Compare the result with the ensemble average.

Results

[Figures: (a) Ensemble average; (b) Time average]

Comments: The two results are very similar and confirm the theoretical probability 
density function. This illustrates that the process is ergodic with respect to the estimation 
of the probability density function. 
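The U-shaped histogram can also be compared with the theoretical density bin by bin. The sketch below is plain Python rather than MATLAB and covers only the time-average case. It assumes A = 2 and 20 bins as in the example, but samples one period at 100000 evenly spaced points (an arbitrary choice) and places the bin edges on $[-A, A]$. The expected relative frequency of each bin is the integral of $p(x)$ over the bin, which has the closed form $[\sin^{-1}(b/A) - \sin^{-1}(a/A)]/\pi$ for a bin from $a$ to $b$.

```python
import math

A = 2.0                      # amplitude, as in Example 8.1
n_samples = 100000           # samples over exactly one period (assumed value)
nbin = 20
width = 2 * A / nbin

# Time average: theta fixed at zero, t swept over one period (sampled at midpoints).
x = [A * math.sin(2 * math.pi * (i + 0.5) / n_samples) for i in range(n_samples)]

counts = [0] * nbin
for value in x:
    k = min(int((value + A) / width), nbin - 1)   # bin index on [-A, A]
    counts[k] += 1
relative_frequency = [c / n_samples for c in counts]

def bin_probability(k):
    """Integral of p(x) = 1 / (pi * sqrt(A^2 - x^2)) over bin k."""
    lo = max(-1.0, (-A + k * width) / A)
    hi = min(1.0, (-A + (k + 1) * width) / A)
    return (math.asin(hi) - math.asin(lo)) / math.pi

theory = [bin_probability(k) for k in range(nbin)]
worst = max(abs(r - p) for r, p in zip(relative_frequency, theory))
print("largest difference between histogram and theory:", worst)
```

The outermost bins carry the largest relative frequency because the sine wave spends most of its time near $\pm A$, where $p(x)$ grows without bound.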
Example 8.2: Autocorrelation function of a sine wave

We compare the autocorrelation functions of a sinusoidal signal $x(t) = A\sin(\omega t + \theta)$, resulting from the ensemble average and the time average. The theoretical autocorrelation function is

$$R_{xx}(\tau) = \frac{A^2}{2}\cos(\omega\tau)$$

For the ensemble average $\theta$ is a random variable and $t$ is fixed, and for the time average $\theta$ is fixed and $t$ is a time variable.


Line  MATLAB code

1     clear all
2     A=2; w=2*pi*1; t=0; fs=100;
3     rand('state',1);
4     theta=rand(1,5000)*2*pi;
5     x1=A*sin(w*t+theta);
6     Rxx1=[]; maxlags=5;
7     for tau=-maxlags:1/fs:maxlags;
8       tmp=A*sin(w*(t+tau)+theta);
9       tmp=mean(x1.*tmp);
10      Rxx1=[Rxx1 tmp];
11    end
12    tau=-maxlags:1/fs:maxlags;
13    Rxx=A^2/2*cos(w*tau);
14    figure(1) % Ensemble average
15    plot(tau,Rxx1,tau,Rxx, 'r:')
16    xlabel('Lag (\it\tau)')
17    ylabel('Autocorrelation')
18    t=0:1/fs:20-1/fs;
19    x2=A*sin(w*t);
20    [Rxx2, tau2]=xcorr(x2,x2,maxlags*fs, 'unbiased');
21    tau2=tau2/fs;
22    figure(2) % Time average
23    plot(tau2,Rxx2,tau,Rxx, 'r:')
24    xlabel('Lag (\it\tau)')
25    ylabel('Autocorrelation')

Comments

Lines 1-2: Define the amplitude and frequency of a sine wave. For the ensemble average, let time t = 0. Also define the sampling rate.

Lines 3-5: Initialize the random number generator, and then generate θ which is uniformly distributed on the range 0 to 2π. The number of elements of θ is 5000. Then generate a sequence x1 which can be considered as an ensemble (only for the specified time, t = 0).

Line 6: Define an empty matrix (Rxx1) which is used in the 'for' loop, and define the maximum lag (5 seconds) for the calculation of the autocorrelation function.

Lines 7-12: The 'for' loop calculates the autocorrelation function Rxx1 based on the ensemble average. The variable 'tau' is the lag in seconds (Line 12).

Line 13: Calculate the theoretical autocorrelation function Rxx. This is used for comparison.

Lines 14-17: Plot the autocorrelation function Rxx1 obtained by ensemble average (solid line), and compare this with the theoretical autocorrelation function Rxx (dashed line).

Lines 18-20: For the time average, θ is set to zero and a sine wave (x2) is generated for 20 seconds. The MATLAB function 'xcorr(y,x)' estimates the cross-correlation function between x and y, i.e. R_xy(τ) (note that it is not R_yx(τ)). In this MATLAB code, the number of maximum lag (maxlags*fs) is also specified, and the unbiased estimator is used.

Line 21: The variable 'tau2' is the lag in seconds.

Lines 22-25: Plot the autocorrelation function Rxx2 obtained by time average (solid line), and compare this with the theoretical autocorrelation function Rxx (dashed line).

Results

[Figures: (a) Ensemble average; (b) Time average. Horizontal axis: Lag (τ)]

Comments: The two results are almost identical and very close to the theoretical autocorrelation function. This demonstrates that the process is ergodic with respect to the estimation of the autocorrelation function.


Example 8.3: Autocorrelation function of an echoed signal 

Consider the following echoed signal (see Equation (8.66) and Figure 8.18):

$$x(t) = s_1(t) + s_2(t) = a\,s(t - \Delta_1) + b\,s(t - \Delta_2)$$

In this example, we use a sine burst signal as the source signal, and demonstrate that the autocorrelation function $R_{xx}(\tau)$ detects the relative time delay $\Delta = \Delta_2 - \Delta_1$.

We shall also consider the case in which some additive noise is present in the signal.


Line  MATLAB code

1     clear all
2     a=2; b=1; fs=200; delta1=1; delta2=2.5;
3     t=0:1/fs:0.5-1/fs;
4     s=sin(2*pi*10*t);
5     N=4*fs;
6     s1=[zeros(1,delta1*fs) a*s]; s1=[s1 zeros(1,N-length(s1))];
7     s2=[zeros(1,delta2*fs) b*s]; s2=[s2 zeros(1,N-length(s2))];
8     x=s1+s2;
9     % randn('state',0);
10    % noise=1*std(s)*randn(size(x));
11    % x=x+noise;
12    L=length(x); t=[0:L-1]/fs;
13    maxlags=2.5*fs;
14    [Rxx, tau]=xcorr(x,x,maxlags);
15    tau=tau/fs;
16    figure(1)
17    plot(t,x)
18    xlabel('Time (s)'); ylabel('\itx\rm(\itt\rm)')
19    axis([0 4 -4 4])
20    figure(2)
21    plot(tau,Rxx)
22    xlabel('Lag (\it\tau)'); ylabel('Autocorrelation')
23    axis([-2.5 2.5 -300 300])

Comments

Lines 1-2: Define the parameters of the above equation. The sampling rate is chosen as 200 Hz. Note that the relative time delay is 1.5 seconds.

Lines 3-4: Define the time variable up to 0.5 seconds, and generate the 10 Hz sine burst signal.

Lines 5-8: Generate signals s1(t) = a s(t − Δ1) and s2(t) = b s(t − Δ2) up to 4 seconds. Then combine these to make the signal x(t).

Lines 9-11: This is for later use. Uncomment these lines then. Initialize the random number generator, then generate the Gaussian white noise whose variance is the same as the source signal s(t), i.e. the signal-to-noise ratio is 0 dB.

Lines 12-15: Define the time variable again according to the length of the signal x(t). The maximum lag of the autocorrelation function is set to 2.5 seconds. The variable 'tau' is the lag in seconds. Note that the autocorrelation function is not normalized in this case because the signal is transient.

Lines 16-19: Plot the signal x(t). Later, compare this with the noisy signal.

Lines 20-23: Plot the autocorrelation function R_xx(τ). Note its symmetric structure, and that the peak values occur at R_xx(0), R_xx(Δ) and R_xx(−Δ). Run this MATLAB program again for the noisy signal (uncomment Lines 9-11), and compare R_xx(τ) with the corresponding time signal.

Results

[Figures: (a1) Clean time signal; (a2) Autocorrelation function of the clean signal; (b1) Noisy signal; (b2) Autocorrelation function of the noisy signal. Horizontal axes: Time (s) for (a1) and (b1), Lag (τ) for (a2) and (b2)]
Comments: Comparing Figures (b1) and (b2), it can be seen that the autocorrelation function is much cleaner than the corresponding time signal, and detects the relative time delay even if a significant amount of noise is present. This is because the source signal and the noise are not correlated, and the white noise contributes to $R_{xx}(\tau)$ at zero lag only (in theory). This noise reduction will be demonstrated again in MATLAB Example 8.5.


Example 8.4: Cross-correlation function 

Consider two signals (see Equation (8.73))

$$x(t) = A\sin(\omega t + \theta_x) + B$$
$$y(t) = C\sin(\omega t + \theta_y) + D\sin(n\omega t + \phi)$$

The cross-correlation function is $R_{xy}(\tau) = \frac{1}{2}AC\cos[\omega\tau - (\theta_x - \theta_y)]$ (see Equation (8.74)).

Line  MATLAB code

1     clear all
2     A=1; B=1; C=2; D=2; thetax=0; thetay=-pi/4; phi=pi/2; n=2;
3     w=2*pi*1; fs=200; T=100; t=0:1/fs:T-1/fs;
4     rel_time_delay=(thetax-thetay)/w
5     x=A*sin(w*t+thetax)+B;
6     y=C*sin(w*t+thetay)+D*sin(n*w*t+phi);
7     maxlag=4*fs;
8     [Rxy, tau]=xcorr(y,x,maxlag, 'unbiased');
9     tau=tau/fs;
10    figure(1)
11    plot(t(1:maxlag),x(1:maxlag), t(1:maxlag),y(1:maxlag), 'r')
12    xlabel('Time (s)'); ylabel('\itx\rm(\itt\rm) and \ity\rm(\itt\rm)')
13    figure(2)
14    plot(tau(maxlag+1:end), Rxy(maxlag+1:end)); hold on
15    plot([rel_time_delay rel_time_delay], [-1.5 1.5], 'r:')
16    hold off
17    xlabel('Lag (\it\tau)'); ylabel('Cross-correlation')

Comments

Lines 1-6: Define the parameters and time variable of the above equation. The sampling rate is chosen as 200 Hz. Calculate the relative time delay for reference. Note that the relative phase is θx − θy = π/4, which corresponds to a time delay of 0.125 seconds. Generate signals x(t) and y(t) accordingly.

Lines 7-9: The maximum lag of the cross-correlation function is set to 4 seconds. The unbiased estimator is used for the calculation of the cross-correlation function.

Lines 10-12: Plot the signals x(t) and y(t).

Lines 13-17: Plot the cross-correlation function R_xy(τ). Note that it shows the values for positive lags only in the figure. Compare the figure with the theoretical cross-correlation function (i.e. Equation (8.74)).

Results

[Figures: time signals x(t) and y(t); cross-correlation function R_xy(τ)]




Comments: The cross-correlation function finds the components in $y(t)$ that are correlated with the components in $x(t)$ and preserves the relative phase (i.e. time delay).


Example 8.5: A signal buried in noise 

Consider the signal buried in noise (see Figure 8.22)

$$y(t) = s(t) + n(t)$$

In this example, we use a sine wave for $s(t)$ and a band-limited white noise for $n(t)$, where $s(t)$ and $n(t)$ are uncorrelated and both have zero mean values. Thus, the cross-correlation between $s(t)$ and $n(t)$ is $E[s(t)n(t+\tau)] = E[n(t)s(t+\tau)] = 0$.

Then, the autocorrelation function is

$$R_{yy}(\tau) = R_{ss}(\tau) + R_{nn}(\tau)$$

It is shown that

$$R_{yy}(\tau) \approx R_{ss}(\tau) \quad \text{for large } \tau \text{ (see Figure 8.23)}$$

Considering the time-averaged form of correlation functions, we also compare the results for different values of $T$ (total record time).


Line  MATLAB code

1     clear all
2     A=1; w=2*pi*1; fs=200;
3     T=100; % T=1000;
4     t=0:1/fs:T-1/fs;
5     s=A*sin(w*t);
6     randn('state',0);
7     n=randn(size(s));
8     fc=20;
9     [b,a]=butter(9, fc/(fs/2));
10    n=filtfilt(b,a,n);
11    n=sqrt(2)*(std(s)/std(n))*n; % SNR=-3dB
12    y=s+n;
13    maxlags=4*fs;
14    [Ryy, tau]=xcorr(y,y,maxlags, 'unbiased');
15    tau=tau/fs;
16    figure(1)
17    plot(t(1:8*fs),y(1:8*fs), t(1:8*fs),s(1:8*fs), 'r:')
18    xlabel('Time (s)')
19    ylabel('\its\rm(\itt\rm) and \ity\rm(\itt\rm)')
20    figure(2)
21    plot(tau(maxlags+1:end), Ryy(maxlags+1:end), 'r')
22    xlabel('Lag (\it\tau)'); ylabel('Autocorrelation')
23    axis([0 4 -1.5 1.5])

Comments

Lines 1-5: Define the parameters of a sine wave and the sampling rate. The total record time is specified by 'T'. Initially, we use T = 100 seconds (i.e. 100 periods in total). Later, we shall increase it to 1000 seconds. Also, define the time variable, and generate the 1 Hz sine wave.

Lines 6-7: Generate the broadband white noise signal. (The frequency band is limited by half the sampling rate, i.e. zero to fs/2 Hz.)

Lines 8-10: These lines convert the above broadband white noise into a band-limited white noise by filtering with a digital low-pass filter. 'fc' is the cut-off frequency of the digital filter. The MATLAB function '[b,a] = butter(9, fc/(fs/2))' designs a ninth-order low-pass digital Butterworth filter (IIR), where 'b' is a vector containing coefficients of a moving average part and 'a' is a vector containing coefficients of an auto-regressive part of the transfer function (see Equation (6.12)). The MATLAB function 'output = filtfilt(b,a,input)' performs zero-phase digital filtering. (Digital filtering will be briefly mentioned in Appendix H.) The resulting sequence 'n' is the band-limited (zero to 20 Hz) white noise.

Lines 11-12: Make the noise power twice the signal power, i.e. 'Var(n) = 2 × Var(s)'. Note that the signal-to-noise ratio is −3 dB. Then, make the noisy signal 'y' by adding n to s.

Lines 13-15: Calculate the autocorrelation function up to 4 seconds of lag.

Lines 16-19: Plot the signals s(t) and y(t) up to 8 seconds on the same figure.

Lines 20-23: Plot the autocorrelation function R_yy(τ) for the positive lags only. Run this MATLAB program again for T = 1000 (change the value at Line 3), and compare the results.

Results

[Figures: (a) Clean time signal s(t) and noisy signal y(t); (b) and (c) autocorrelation function R_yy(τ) for the two record times T]
Comments: Comparing Figures (b) and (c), it can be seen that as T increases the results
improve. Note, however, that this is not so if the signal is a transient. For example, if
the total record time in MATLAB Example 8.3 is increased, the result gets worse (Line 5,
N = 4*fs; of MATLAB Example 8.3). This is because, after the transient signal dies out, it
does not average ‘signal × noise’ but ‘noise × noise’. Thus, we must apply the correlation
functions appropriately depending on the nature of the signals.
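
The noise scaling used in this example (noise power set to twice the signal power) can be checked numerically. The following is a minimal sketch, not part of the original listing: it assumes `rng(0)` in place of the legacy `randn('state',0)` seeding, and it omits the low-pass filtering because that step does not change the final scaling.

```matlab
% Sketch: confirm that n = sqrt(2)*(std(s)/std(n))*n gives an SNR of about -3 dB.
% Assumptions: rng(0) replaces the legacy randn('state',0) seeding, and the
% low-pass filtering of the noise is left out (it does not affect the scaling).
fs = 200; T = 100;
t = 0:1/fs:T-1/fs;
s = sin(2*pi*1*t);                      % 1 Hz sine wave, unit amplitude
rng(0);
n = randn(size(s));                     % broadband white noise
n = sqrt(2)*(std(s)/std(n))*n;          % force var(n) = 2*var(s)
snr_dB = 10*log10(var(s)/var(n));       % power ratio in dB
fprintf('SNR = %.4f dB\n', snr_dB);
```

Because the scaling fixes the variance ratio at exactly 2, the printed value should equal 10 log10(1/2), i.e. close to −3.01 dB, whatever the random seed.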


Example 8.6: Application of the cross-correlation function (time delay problem 1) 

Consider the wheeled vehicle example given in Section 8.6 (see Figure 8.24). We assume
that the surface profile results in a band-limited time function (or profile) s(t) and the
trailing wheel experiences the same profile $\Delta$ seconds later, i.e. $s(t - \Delta)$.

We measure both these signals, and include uncorrelated broadband noises $n_x(t)$
and $n_y(t)$, i.e.

$$x(t) = s(t) + n_x(t)$$
$$y(t) = s(t - \Delta) + n_y(t)$$

The cross-correlation function $R_{xy}(\tau)$ is (assuming zero mean values)

$$R_{xy}(\tau) = E[(s(t) + n_x(t))(s(t - \Delta + \tau) + n_y(t + \tau))] = E[s(t)s(t + \tau - \Delta)] = R_{ss}(\tau - \Delta)$$

Line  MATLAB code

1   clear all
2   fs=1000; T=5; t=0:1/fs:T-1/fs;
3   randn('state',0);
4   s=randn(size(t));
5   fc=100; [b,a]=butter(9, fc/(fs/2));
6   s=filtfilt(b,a,s);
7   s=s-mean(s); s=s/std(s); % Makes mean(s)=0 & std(s)=1;
8   delta=0.2;
9   x=s(delta*fs+1:end);
10  y=s(1:end-delta*fs);
11  randn('state',1); nx=1*std(s)*randn(size(x));
12  randn('state',2); ny=1*std(s)*randn(size(y));
13  x=x+nx; y=y+ny;
14  maxlag1=0.25*fs; maxlag2=0.5*fs;
15  [Rss, tau1]=xcorr(s,s,maxlag1, 'unbiased');
16  [Rxy, tau2]=xcorr(y,x,maxlag2, 'unbiased');
17  tau1=tau1/fs; tau2=tau2/fs;
18  figure(1)
19  plot(tau1,Rss)
20  axis([-0.25 0.25 -0.4 1.2])
21  xlabel('Lag (\it\tau)')
22  ylabel('Autocorrelation (\itR_s_s\rm(\it\tau\rm))')
23  figure(2)
24  plot(tau2(maxlag2+1:end), Rxy(maxlag2+1:end))
25  axis([0 0.5 -0.4 1.2])
26  xlabel('Lag (\it\tau)')
27  ylabel('Cross-correlation (\itR_x_y\rm(\it\tau\rm))')

Comments

Lines 1-2: The sampling rate is 1000 Hz, and the time variable is defined up to 5 seconds.

Lines 3-6: Broadband white noise is generated, and then it is low-pass filtered to produce
a band-limited white noise s(t), where the cut-off frequency of the filter is 100 Hz.

Line 7: Produce the signal s(t) such that it has zero mean value and the standard
deviation is one.

Lines 8-10: Define the delay time $\Delta$ = 0.2 seconds, and generate two sequences that
correspond to s(t) and $s(t - \Delta)$.

Lines 11-13: Generate the broadband white noise $n_x(t)$ and $n_y(t)$. Then add these
signals appropriately to make noisy measured signals x(t) and y(t). Note that the
signal-to-noise ratio is 0 dB for both signals.

Lines 14-17: Calculate the autocorrelation function $R_{ss}(\tau)$ and the cross-correlation
function $R_{xy}(\tau)$, where the unbiased estimators are used.

Lines 18-22: Plot the autocorrelation function $R_{ss}(\tau)$.

Lines 23-27: Plot the cross-correlation function $R_{xy}(\tau)$, and compare this with the
autocorrelation function.



Results

(a) Autocorrelation function $R_{ss}(\tau)$, plotted against lag $\tau$

(b) Cross-correlation function $R_{xy}(\tau)$, plotted against lag $\tau$ (0 to 0.5 seconds)


Comments: Note that, although both signals x(t) and y(t) have a very low SNR (0 dB
in this example), the cross-correlation function gives a clear copy of $R_{ss}(\tau)$ at
$\tau = 0.2$ seconds.
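
The delay can also be read off numerically as the lag at which the cross-correlation estimate peaks. The following is a minimal sketch under stated assumptions rather than the original listing: `xcorr` (Signal Processing Toolbox) is assumed to be available, `rng` replaces the legacy seeding, the profile is left unfiltered, and all variable names are illustrative.

```matlab
% Sketch: read the delay off the peak of the cross-correlation estimate.
% Assumptions: xcorr (Signal Processing Toolbox) is available; rng replaces
% the legacy randn('state',...) seeding; the profile s is not low-pass filtered.
fs = 1000; T = 5; delta = 0.2;
d = round(delta*fs);                    % delay in samples
rng(0);
s = randn(1, T*fs);                     % broadband profile
x = s(d+1:end) + randn(1, T*fs-d);      % leading measurement, SNR about 0 dB
y = s(1:end-d) + randn(1, T*fs-d);      % trailing measurement, delayed by d samples
maxlag = round(0.5*fs);
[Rxy, lags] = xcorr(y, x, maxlag, 'unbiased');
[~, k] = max(Rxy);                      % index of the largest peak
delay_est = lags(k)/fs;                 % convert the lag index to seconds
fprintf('Estimated delay = %.3f s\n', delay_est);
```

Picking the maximum is the simplest choice; it resolves the delay only to the nearest sample, which is why the phase-slope approach of MATLAB Example 8.9 can be preferable when finer resolution is needed.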
Example 8.7: Application of the cross-correlation function (time delay problem 2)

This example describes the basic concept of using the cross-correlation function in a 
(radar-like) system. It is similar to the previous Example (MATLAB Example 8.6), 
except that we shall use a pulse-like signal for this example. 

Let x(t) be a pulse transmitted by the radar system, and y(t) be the received signal
that contains a reflected pulse from a target such that

$$y(t) = a\,x(t - \Delta) + n(t)$$

where n(t) is uncorrelated broadband noise. Note that the amplitude of the reflected
pulse in y(t) may be very small compared with the original pulse, and so the SNR of the
received signal y(t) will also be very low.

To maximize the detectability, a special filter called a ‘matched filter’ is usually 
used. The matched filter is known to be an optimal detector while maximizing the SNR 
of a signal that is buried in noise (Papoulis, 1977; Bencroft, 2002). 

If the length of the pulse x(t) is T seconds, the impulse response function of the
matched filter is defined by

$$h(t) = x(T - t)$$

i.e. the pulse x(t) is time reversed and shifted. Now, the received signal y(t) is filtered,
i.e. y(t) is an input to the matched filter as shown in Figure (a).


Input y(t) → [ h(t) = x(T − t) ] → Output out(t)

(a) Matched filtering


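The output signal is the convolution of y(t) and x(T − t). Thus, it follows that

$$\mathrm{out}(t) = \int_{-\infty}^{\infty} y(\tau)h(t - \tau)\,d\tau = \int_{-\infty}^{\infty} y(\tau)x(T - (t - \tau))\,d\tau = \int_{-\infty}^{\infty} y(\tau)x(T - t + \tau)\,d\tau$$

$$= \int_{-\infty}^{\infty} y(\tau)x(\tau + (T - t))\,d\tau = R_{yx}(T - t) = R_{xy}(t - T)$$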


Note that the result is the cross-correlation between the original pulse x(t) and the received
signal y(t), shifted by the length of the filter T.

Assuming zero mean values, the cross-correlation function $R_{xy}(\tau)$ is

$$R_{xy}(\tau) = E[x(t)(a\,x(t - \Delta + \tau) + n(t + \tau))] = aE[x(t)x(t + \tau - \Delta)] = aR_{xx}(\tau - \Delta)$$

i.e. the cross-correlation function gives the total time delay $\Delta$ between the transmitter
and the receiver. Thus, the distance to the target can be estimated by multiplying half of
the time delay, $\Delta/2$, by the speed of the wave. The filtered output is

$$\mathrm{out}(t) = R_{xy}(t - T) = aR_{xx}(t - T - \Delta) = aR_{xx}(t - (T + \Delta))$$

We will compare the two results: the direct cross-correlation function $R_{xy}(\tau)$ and the
filtered output out(t). A significant amount of noise will be added to the received signal
(SNR is −6 dB).


Line  MATLAB code

1   clear all
2   fs=200;
3   t=0:1/fs:1;
4   x=chirp(t,5,1,15);
5   h=fliplr(x); % Matched filter
6   figure(1)
7   plot(t,x)
8   xlabel('Time (s)')
9   ylabel('Chirp waveform, \itx\rm(\itt\rm)')
10  figure(2)
11  plot(t,h)
12  xlabel('Time (s)')
13  ylabel('Matched filter, \ith\rm(\itt\rm)')
14  delta=2; a=0.1;
15  y=[zeros(1,delta*fs) a*x zeros(1,3*fs)];
16  t=[0:length(y)-1]/fs;
17  randn('state',0);
18  noise=2*std(a*x)*randn(size(y));
19  y=y+noise;
20  figure(3)
21  plot(t,y)
22  xlabel('Time (s)')
23  ylabel('Received signal, \ity\rm(\itt\rm)')
24  maxlags=5*fs;
25  [Rxy, tau]=xcorr(y,x,maxlags);
26  tau=tau/fs;
27  figure(4)
28  plot(tau(maxlags+1:end), Rxy(maxlags+1:end))
29  xlabel('Lag (\it\tau)')
30  ylabel('Cross-correlation, \itR_x_y\rm(\it\tau\rm)')
31  out=conv(y,h); out=out(1:length(y)); % or out=filter(h,1,y);
32  figure(5)
33  plot(t(1:maxlags),out(1:maxlags))
34  xlabel('Time (s)')
35  ylabel('Filtered signal, \itout\rm(\itt\rm)')

Comments

Lines 1-3: The sampling rate is 200 Hz, and the time variable is defined up to 1 second.
This is the duration of the pulse.

Line 4: For the transmitted pulse, a chirp waveform is used. The MATLAB function
‘chirp(t,5,1,15)’ generates a linear swept frequency signal at the time instances defined
by ‘t’, where the instantaneous frequency at time 0 is 5 Hz and at time 1 second is 15 Hz.

Line 5: Then, define the matched filter h. The MATLAB function ‘fliplr(x)’ flips the
vector x in the left/right direction. The result is h(t) = x(T − t), where T is 1 second in
this case.

Lines 6-9: Plot the transmitted chirp waveform x(t).

Lines 10-13: Plot the impulse response function of the matched filter h(t), and compare
with the waveform x(t).

Line 14: Define the total time delay $\Delta$ = 2 seconds and the relative amplitude of the
reflected waveform a = 0.1.

Lines 15-16: Generate the received signal y(t). We assume that the signal is measured for
up to 6 seconds. Define the time variable again according to the signal y(t).

Lines 17-19: Generate the white noise whose standard deviation is twice that of the
reflected waveform, then add this to the received signal. The resulting signal has an SNR
of −6 dB, i.e. the noise power is four times greater than the signal power.

Lines 20-23: Plot the noisy received signal y(t). Note that the reflected waveform is
completely buried in noise, and so is not noticeable (see Figure (d) below).

Lines 24-26: Define the maximum lags (up to 5 seconds), and calculate the
cross-correlation function $R_{xy}(\tau)$. Note that $R_{xy}(\tau)$ is not normalized.

Lines 27-30: Plot the cross-correlation function $R_{xy}(\tau)$. Note that the peak occurs
at $\tau$ = 2.

Line 31: Now, calculate out(t) by performing the convolution of y(t) and h(t). The same
result can be achieved by ‘out=filter(h,1,y)’. Note that ‘h’ can be considered as an FIR
(Finite Impulse Response) digital filter (or an MA system). Then, the elements of ‘h’ are
the coefficients of the MA part of the transfer function (see Equation (6.12)). In this case,
there is no coefficient for the auto-regressive part, except ‘1’ in the denominator of the
transfer function.

Lines 32-35: Plot the filtered signal out(t), and compare this with the cross-correlation
function $R_{xy}(\tau)$. Now, the peak occurs at t = 3 and the shape is exactly the same as
the cross-correlation function, i.e. $R_{xy}(\tau)$ is delayed by the length of the filter T.


Results

(b) Transmitted chirp waveform, x(t)

(c) The matched filter, h(t)

(e) Cross-correlation function $R_{xy}(\tau)$

(f) Output signal of the matched filter, out(t)


Comments: Note that Figure (f) is simply a delayed version of Figure (e). This example 
demonstrates that the cross-correlation function maximizes the SNR of a signal that is 
buried in noise. 
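
The step from the matched-filter peak to a distance estimate can be sketched as follows. This is an illustrative sketch only: the echo is noise free so that the result is repeatable, the chirp is built from base functions rather than `chirp`, and the wave speed `c` is a hypothetical value that does not appear in the original example.

```matlab
% Sketch: locate the matched-filter peak and convert it to a range estimate.
% Assumptions: noise-free echo for a repeatable result; the chirp is built
% from base functions instead of chirp(); c is a hypothetical wave speed.
fs = 200; T = 1; delta = 2; a = 0.1;
t = 0:1/fs:T;
x = cos(2*pi*(5*t + 5*t.^2));           % linear sweep: 5 Hz at t = 0, 15 Hz at t = 1 s
h = fliplr(x);                          % matched filter, h(t) = x(T - t)
y = [zeros(1, delta*fs) a*x zeros(1, 3*fs)];   % echo delayed by delta seconds
out = conv(y, h);                       % matched-filter output
[~, k] = max(out);                      % sample index of the peak
t_peak = (k - 1)/fs;                    % peak time, expected at T + delta
delay_est = t_peak - T;                 % remove the filter length T
c = 3e8;                                % hypothetical wave speed in m/s
range_est = c*delay_est/2;              % two-way travel, so halve the delay
fprintf('Peak at %.2f s, delay %.2f s, range %.3g m\n', t_peak, delay_est, range_est);
```

The peak falls at t = T + Δ, in line with out(t) = aR_xx(t − (T + Δ)); subtracting T recovers the two-way delay before it is halved.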


Example 8.8: Cross-spectral density function (compare with MATLAB Example 8.4) 

Consider two signals (see Equation (8.117))

$$x(t) = A\sin(2\pi pt + \theta_x)$$
$$y(t) = C\sin(2\pi pt + \theta_y) + D\sin(n2\pi pt + \phi)$$

These are the same as in MATLAB Example 8.4 except that the constant B is not included
here. The cross-correlation function and one-sided cross-spectral density function are (see
Equations (8.119) and (8.121))

$$R_{xy}(\tau) = \frac{1}{2}AC\cos(2\pi p\tau - \theta_{xy}) \quad \text{and} \quad G_{xy}(f) = \frac{AC}{2}\delta(f - p)e^{-j\theta_{xy}}$$

where $\theta_{xy} = \theta_x - \theta_y$.
Line  MATLAB code

1   clear all
2   A=1; C=2; D=2; thetax=0; thetay=-pi/4; phi=pi/2; n=2;
3   p=1; w=2*pi*p; fs=200; T=1000; t=0:1/fs:T-1/fs;
4   x=A*sin(w*t+thetax);
5   y=C*sin(w*t+thetay)+D*sin(n*w*t+phi);
6   maxlag=4*fs;
7   [Rxy, tau]=xcorr(y,x,maxlag, 'unbiased');
8   tau=tau/fs;
9   f=fs*(0:maxlag-1)/maxlag;
10  Rxy=Rxy(maxlag+1:end-1); % makes exactly four periods
11  Sxy=fft(Rxy);
12  format long
13  thetaxy=thetax-thetay
14  ind=find(f==p);
15  arg_Sxy_at_p_Hz=angle(Sxy(ind))

Comments

Lines 1-8: Same as in MATLAB Example 8.4, except that ‘T’ is increased by 10 times for
better estimation of the cross-correlation function.

Line 9: Define the frequency variable.

Lines 10-11: Discard the negative part of $\tau$, i.e. we only take $R_{xy}(\tau)$ for $\tau \ge 0$.
This makes it exactly four periods. Then, obtain $S_{xy}(f)$ via the DFT of the
cross-correlation function.

Lines 12-15: The MATLAB command ‘format long’ displays longer digits. Display the
value of $\theta_{xy} = \theta_x - \theta_y$, which is $\pi/4$, and find the index of frequency p Hz in
the vector ‘f’. Display the value of $\arg S_{xy}(f)$ at p Hz, and compare with the value
of $\theta_{xy}$.


Results 


thetaxy = 0.78539816339745
arg_Sxy_at_p_Hz = -0.78540370804295

Comments: This demonstrates that $\arg S_{xy}(f) = -(\theta_x - \theta_y)$. We can see that the longer
the data length (T), the better the estimate of $R_{xy}(\tau)$, which results in a better estimate of
$S_{xy}(f)$. Note, however, that we estimate $S_{xy}(f)$ by Fourier transforming the product of
the estimate of $R_{xy}(\tau)$ and the rectangular window (i.e. the maximum lag is defined when
$R_{xy}(\tau)$ is calculated (see Lines 7 and 10 of the MATLAB code)). The role of window
functions is discussed in Chapter 10.
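
The frequency bin for p Hz is located in this example with an exact equality test on a floating-point vector. That works here because p falls exactly on the frequency grid, but a nearest-bin search is a more forgiving alternative. The following is a sketch of that alternative under the same parameters, not the method used in the listing.

```matlab
% Sketch: locate the bin nearest to p Hz without an exact equality test.
% Assumption: same fs, maxlag and p as in this example.
fs = 200; maxlag = 4*fs; p = 1;
f = fs*(0:maxlag-1)/maxlag;             % frequency axis, resolution fs/maxlag = 0.25 Hz
[~, ind] = min(abs(f - p));             % nearest bin, robust to rounding in f
fprintf('Bin %d corresponds to %.2f Hz\n', ind, f(ind));
```

If p does not lie on the grid, this returns the closest available frequency rather than an empty index, so the mismatch is visible in the printed value instead of producing an error later.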


Example 8.9: Application of the cross-spectral density function (compare with 
MATLAB Example 8.6) 

Consider the same example as in MATLAB Example 8.6 (the wheeled vehicle), where
the measured signals are

$$x(t) = s(t) + n_x(t)$$
$$y(t) = s(t - \Delta) + n_y(t)$$

and the cross-correlation function and the cross-spectral density function are

$$R_{xy}(\tau) = R_{ss}(\tau - \Delta) \quad \text{and} \quad S_{xy}(f) = H(f)S_{ss}(f) = e^{-j2\pi f\Delta}S_{ss}(f)$$

First, we shall estimate $\Delta$ using $S_{xy}(f)$. Then, by forming the ratio $S_{xy}(f)/S_{xx}(f)$ we
shall estimate the frequency response function H(f), from which the time delay $\Delta$ can
also be estimated. Note that we are not using $S_{xy}(f)/S_{ss}(f)$, but the following:

$$H_1(f) = \frac{S_{xy}(f)}{S_{xx}(f)} = \frac{S_{xy}(f)}{S_{ss}(f) + S_{n_x n_x}(f)} \qquad (1)$$

Since $R_{n_x n_x}(\tau)$ is an even function, $S_{n_x n_x}(f)$ is real valued, i.e. $\arg S_{n_x n_x}(f) = 0$. Thus, it
can be shown that

$$\arg H(f) = \arg H_1(f) = -2\pi f\Delta$$

Note that $H_1(f)$ may underestimate the magnitude of H(f) depending on the variance of
the noise. However, the phase of $H_1(f)$ is not affected by uncorrelated noise, i.e. we can
see that the phase of $H_1(f)$ is less sensitive to noise than the magnitude of $H_1(f)$. More
details of the estimator $H_1(f)$ defined by Equation (1) will be discussed in Chapter 9.


Line  MATLAB code

1   clear all
2   fs=500; T=100; t=0:1/fs:T-1/fs;
3   randn('state',0);
4   s=randn(size(t));
5   fc=100; [b,a]=butter(9,fc/(fs/2));
6   s=filtfilt(b,a,s);
7   s=s-mean(s); s=s/std(s); % Makes mean(s)=0 & std(s)=1;
8   delta=0.2;
9   x=s(delta*fs+1:end);
10  y=s(1:end-delta*fs);
11  randn('state',1); nx=1*std(s)*randn(size(x));
12  randn('state',2); ny=1*std(s)*randn(size(y));
13  x=x+nx; y=y+ny;
14  maxlag=fs;
15  [Rxx, tau]=xcorr(x,x,maxlag, 'unbiased');
16  [Rxy, tau]=xcorr(y,x,maxlag, 'unbiased');
17  tau=tau/fs;
18  f=fs*(0:maxlag-1)/maxlag;
19  Rxy_1=Rxy(maxlag+1:end-1);
20  Sxy=fft(Rxy_1);
21  figure(1)
22  plot(f(1:maxlag/2+1), unwrap(angle(Sxy(1:maxlag/2+1))))
23  hold on
24  xlabel('Frequency (Hz)')
25  ylabel('arg\itS_x_y\rm(\itf\rm) (rad)'); axis([0 fs/2 -160 0])
26  ind=find(f==fc);
27  P1=polyfit(f(2:ind), unwrap(angle(Sxy(2:ind))), 1);
28  format long
29  t_delay1=-P1(1)/(2*pi)
30  plot(f(2:ind), P1(1)*f(2:ind)+P1(2), 'r:'); hold off
31  N=2*maxlag;
32  f=fs*(0:N-1)/N;
33  Sxx=fft(Rxx(1:N)); Sxy=fft(Rxy(1:N));
34  % Sxx=fft(Rxx(1:N)).*exp(i*2*pi.*f*(maxlag/fs));
35  % Sxy=fft(Rxy(1:N)).*exp(i*2*pi.*f*(maxlag/fs));
36  H1=Sxy./Sxx;
37  figure(2)
38  plot(f(1:maxlag+1), unwrap(angle(H1(1:maxlag+1))))
39  hold on
40  xlabel('Frequency (Hz)')
41  ylabel('arg\itH\rm_1(\itf\rm) (rad)'); axis([0 fs/2 -160 0])
42  ind=find(f==fc);
43  P2=polyfit(f(2:ind), unwrap(angle(H1(2:ind))), 1);
44  t_delay2=-P2(1)/(2*pi)
45  plot(f(2:ind), P2(1)*f(2:ind)+P2(2), 'r:'); hold off

Comments

Lines 1-17: Same as in MATLAB Example 8.6, except that the sampling rate is reduced
and the total record time ‘T’ is increased. Note that the delay time $\Delta$ = 0.2 seconds as
before. Note also that the same number of lags is used for both autocorrelation and
cross-correlation functions.

Line 18: Define the frequency variable.

Line 19: Discard the negative part of $\tau$, i.e. we only take $R_{xy}(\tau)$ for $\tau \ge 0$. If we
include the negative part of $\tau$ when the DFT is performed, then the result is a pure
delay. If this is the case, we must compensate for this delay.

Line 20: Then, obtain the cross-spectral density function via the DFT of the
cross-correlation function.

Lines 21-25: Plot the unwrapped $\arg S_{xy}(f)$ up to half the sampling rate. Then hold the
figure. We can see the linear phase characteristic up to about 100 Hz. Note that the signal
is band-limited (0 to 100 Hz), thus the values above 100 Hz are meaningless.

Lines 26-27: Find the index of the cut-off frequency (100 Hz) in the vector ‘f’. Then,
perform first-order polynomial curve fitting to find the slope of the phase curve.

Lines 28-30: Display the estimated time delay. Plot the results of curve fitting on the same
figure, then release the figure.

Lines 31-36: Calculate $S_{xx}(f)$ and $S_{xy}(f)$ using the DFT of $R_{xx}(\tau)$ and $R_{xy}(\tau)$,
respectively. Since $R_{xx}(\tau)$ is an even function, we must include the negative part of $\tau$
in order to preserve the symmetric property. Note that the last value of the vector Rxx is
not included to pinpoint frequency values in the vector f. Then, estimate the frequency
response function $S_{xy}(f)/S_{xx}(f)$ (Line 36). As mentioned earlier, we must compensate
for the delay due to the inclusion of the negative part of $\tau$. However, this is not necessary
for the estimation of the frequency response function, i.e. the ratio $S_{xy}(f)/S_{xx}(f)$ cancels
the delay if $R_{xx}(\tau)$ and $R_{xy}(\tau)$ are delayed by the same amount. Lines 34 and 35
compensate for the delay, and can be used in place of Line 33.

Lines 37-41: Plot the unwrapped $\arg H_1(f)$ up to half the sampling rate. Then hold the
figure. Compare this with the previous result.

Lines 42-43: Perform first-order polynomial curve fitting as before.

Lines 44-45: Display the estimated time delay, and plot the results of curve fitting on the
same figure, then release the figure.
Results

(a) Results of using $S_{xy}(f)$: t_delay1 = 0.20000758322606

(b) Results of using $H_1(f)$: t_delay2 = 0.20000528784852


Comments: Note that the two methods give almost identical results and estimate the delay
time $\Delta$ = 0.2 very accurately.
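
The conversion from phase slope to time delay used in this example follows from $\arg H(f) = -2\pi f\Delta$. The following is a minimal, noise-free sketch of that step on its own; the 1 Hz frequency grid and the variable names are assumptions made for illustration.

```matlab
% Sketch: recover a pure delay from the slope of an ideal linear phase.
% Assumption: noise-free phase arg H(f) = -2*pi*f*delta, sampled on a
% hypothetical 1 Hz grid up to 100 Hz.
delta = 0.2;                            % true delay in seconds
f = 1:100;                              % frequencies used for the fit (Hz)
H = exp(-1i*2*pi*f*delta);              % frequency response of a pure delay
ph = unwrap(angle(H));                  % remove the 2*pi jumps before fitting
P = polyfit(f, ph, 1);                  % first-order fit: ph ~ P(1)*f + P(2)
t_delay = -P(1)/(2*pi);                 % slope = -2*pi*delta
fprintf('Estimated delay = %.6f s\n', t_delay);
```

Unwrapping is only reliable when the phase changes by less than π between adjacent frequency points; here the step is 2π × 1 Hz × 0.2 s = 0.4π, so the condition is met.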


Example 8.10: System identification using spectral density functions 

In the previous MATLAB example, we saw that H\(f) = S xy (f)/S xx (f) estimates the 
system frequency response function. Although we shall discuss this matter in depth in 
Chapter 9, a simple example at this stage may be helpful for understanding the role of 
correlation and spectral density functions. 

Consider the input-output relationship of a single-degree-of-freedom system in
Figure (a).

x(t) → [ $h(t) = \frac{A}{\omega_d}e^{-\zeta\omega_n t}\sin\omega_d t$ ] → y(t)

(a) A single-degree-of-freedom system


In this example, we use white noise as an input x(t), i.e. $R_{xx}(\tau) = k\delta(\tau)$; then the
output y(t) is obtained by y(t) = h(t) * x(t).

$R_{xx}(\tau)$, $R_{yy}(\tau)$, $R_{xy}(\tau)$ and $H_1(f) = S_{xy}(f)/S_{xx}(f)$ are examined for two different
values of measurement time. We shall see that the estimation results get better as the total
record time T increases.


Line  MATLAB code

1   clear all
2   fs=100; t=[0:1/fs:2.5-1/fs];
3   A=100; zeta=0.03; f=10; wn=2*pi*f; wd=sqrt(1-zeta^2)*wn;
4   h=(A/wd)*exp(-zeta*wn*t).*sin(wd*t);
5   randn('state',0);
6   T=100; % 100 and 2000
7   x=2*randn(1,T*fs);
8   y=conv(h,x); y=y(1:end-length(h)+1);
9   % y=filter(h,1,x);
10  maxlag=2.5*fs;
11  [Rxx, tau]=xcorr(x,x,maxlag, 'unbiased');
12  [Ryy, tau]=xcorr(y,y,maxlag, 'unbiased');
13  [Rxy, tau]=xcorr(y,x,maxlag, 'unbiased');
14  tau=tau/fs;
15  N=2*maxlag;
16  f=fs*(0:N-1)/N;
17  Sxx=fft(Rxx(1:N)); Syy=fft(Ryy(1:N))/(fs^2);
18  Sxy=fft(Rxy(1:N))/fs;
19  H1=Sxy./Sxx;
20  H=fft(h,N)/fs;
21  figure(1)
22  plot(tau,Rxx)
23  xlabel('Lag (\it\tau)'); ylabel('\itR_x_x\rm(\it\tau\rm)')
24  figure(2)
25  plot(tau,Ryy)
26  xlabel('Lag (\it\tau)'); ylabel('\itR_y_y\rm(\it\tau\rm)')
27  figure(3)
28  plot(tau,Rxy)
29  xlabel('Lag (\it\tau)'); ylabel('\itR_x_y\rm(\it\tau\rm)')
30  figure(4)
31  plot(f(1:N/2+1), 20*log10(abs(H1(1:N/2+1)))); hold on
32  xlabel('Frequency (Hz)'); ylabel('|\itH\rm_1(\itf\rm)| (dB)')
33  plot(f(1:N/2+1), 20*log10(abs(H(1:N/2+1))), 'r:'); hold off
34  figure(5)
35  plot(f(1:N/2+1), unwrap(angle(H1(1:N/2+1)))); hold on
36  xlabel('Frequency (Hz)'); ylabel('arg\itH\rm_1(\itf\rm) (rad)')
37  plot(f(1:N/2+1), unwrap(angle(H(1:N/2+1))), 'r:'); hold off

Comments

Lines 1-4: Define parameters for the impulse response function h(t), and generate a
sequence accordingly. Note that the impulse response is truncated at 2.5 seconds.

Lines 5-9: Generate a white noise signal for input x(t). Note that the variance of x(t) is
four (in theory). Then obtain the output signal by convolution of h(t) and x(t). First, run
this MATLAB program using the total record time T = 100 seconds. Later run this
program again using T = 2000, and compare the results. Note that the sequence ‘h’ is an
FIR filter, and Line 9 can be used instead of Line 8.

Lines 10-14: Calculate the correlation functions. Note that we define the maximum lag
equal to the length of the filter h.

Lines 15-20: Calculate the spectral density functions. Note that different scaling factors
are used for Sxx, Syy and Sxy in order to relate to their continuous functions (in relative
scale). This is due to the convolution operation in Line 8, i.e. the sequence ‘y’ must be
divided by ‘fs’ for the equivalent time domain signal y(t). Calculate
$H_1(f) = S_{xy}(f)/S_{xx}(f)$, and also calculate H(f) by the DFT of the impulse response
sequence. Then compare these two results.

Lines 21-23: Plot the autocorrelation function $R_{xx}(\tau)$. It is close to the delta function
(but note that it is not a ‘true’ delta function), and $R_{xx}(0) \approx 4$, which is the variance
of x(t).

Lines 24-26: Plot the autocorrelation function $R_{yy}(\tau)$. Note that its shape is reflected
by the impulse response function.

Lines 27-29: Plot the cross-correlation function $R_{xy}(\tau)$. Note that its shape resembles
the impulse response function.

Lines 30-33: Plot the magnitude spectrum of both $H_1(f)$ and H(f) (in dB scale), and
compare them.

Lines 34-37: Plot the phase spectrum of both $H_1(f)$ and H(f), and compare them. Run
this MATLAB program again for T = 2000, and compare the results.



Results

[Figure panels for T = 100 and T = 2000: autocorrelation function $R_{xx}(\tau)$ (panels (b1) and (c1)) and autocorrelation function $R_{yy}(\tau)$ (panels (b2) and (c2)), plotted against lag $\tau$; magnitude $|H_1(f)|$ (dB) and phase $\arg H_1(f)$ (rad), plotted against frequency.]
Comments:

1. By comparing the results of using T = 100 and T = 2000, it can be seen that as the
length of data (T) increases, i.e. as the number of averages increases, we obtain better
estimates of correlation functions and frequency response functions.

Note particularly that the cross-correlation function $R_{xy}(\tau)$ has a shape similar
to the impulse response function h(t). In fact, in the next chapter, we shall
see that $R_{xy}(\tau) = kh(\tau)$ where k is the variance of the input white noise. To see
this, type the following script in the MATLAB command window (use the result of
T = 2000):


plot(t,4*h); hold on
plot(tau(maxlag+1:end), Rxy(maxlag+1:end), 'r:'); hold off
xlabel('Time (s) and lag (\it\tau\rm)'); ylabel('Amplitude')


The results are as shown in Figure (d). Note that h(t) is multiplied by 4, which
is the variance of the input white noise. (Note that it is not true white noise,
but is band-limited up to ‘fs/2’, i.e. fs/2 corresponds to B in Equation (8.97) and
Figure 8.30.)
Also note that the autocorrelation function of the output is the scaled version of
the autocorrelation function of the impulse response function, i.e. $R_{yy}(\tau) = kR_{hh}(\tau)$.
Type the following script in the MATLAB command window to verify this:


Rhh=xcorr(h,h,maxlag);
plot(tau,4*Rhh); hold on
plot(tau, Ryy, 'r:'); hold off
xlabel('Lag (\it\tau\rm)'); ylabel('Amplitude')

The results are shown in Figure (e). Note that $R_{hh}(\tau)$ is not normalized since h(t) is
transient.


(e) Comparison of $R_{hh}(\tau)$ and $R_{yy}(\tau)$


Note that, in this MATLAB example, the system frequency response function is
scaled appropriately to match its continuous function. However, the correlation and
spectral density functions are not exactly matched to their continuous functions; they
are scaled relatively.

2. In this example, we have estimated the spectral density functions by taking the Fourier
transform of correlation functions. However, there are better estimation methods such
as the segment averaging method (also known as Welch’s method (Welch, 1967)).
Although this will be discussed in Chapter 10, the use of Welch’s method is briefly
demonstrated. Type the following script in the MATLAB command window (use the
result of T = 100):


Sxx_w=cpsd(x,x, hanning(N),N/2, N, fs, 'twosided');
Sxy_w=cpsd(x,y/fs, hanning(N),N/2, N, fs, 'twosided');
H1_w=Sxy_w./Sxx_w;
figure(1)
plot(f(1:N/2+1), 20*log10(abs(H1_w(1:N/2+1)))); hold on
plot(f(1:N/2+1), 20*log10(abs(H1(1:N/2+1))), 'r:'); hold off
xlabel('Frequency (Hz)'); ylabel('|\itH\rm_1(\itf\rm)| (dB)')
figure(2)
plot(f(1:N/2+1), unwrap(angle(H1_w(1:N/2+1)))); hold on
plot(f(1:N/2+1), unwrap(angle(H1(1:N/2+1))), 'r:'); hold off
xlabel('Frequency (Hz)'); ylabel('arg\itH\rm_1(\itf\rm) (rad)')


The MATLAB function ‘cpsd’ estimates the cross-spectral density function using
Welch’s method. In this MATLAB script, the spectral density functions are estimated
using a Hann window and 50 % overlap. Then, the frequency response function
estimated using Welch’s method is compared with the previous estimate (shown in
Figures (b4) and (b5)). Note that the output sequence is divided by the sampling rate,
i.e. ‘y/fs’ is used in the calculation of cross-spectral density ‘Sxy_w’ to match to its
corresponding continuous function.

The results are shown in Figures (f1) and (f2). Note that the result of using
Welch’s method is the smoothed estimate. This smoothing reduces the variability, but
the penalty for this is the degradation of accuracy due to bias error. In general, the
smoothed estimator underestimates peaks and overestimates troughs. The details of
bias and random errors are discussed in Chapter 10.


(f1) Magnitude spectrum of $H_1(f)$ using Welch’s method

(f2) Phase spectrum of $H_1(f)$ using Welch’s method
Linear System Response to Random Inputs: System Identification


Introduction

Having described linear systems and random signals, we are now able to model the
response of linear systems to random excitations. We concentrate on single-input,
single-output systems, but can also consider how additional inputs in the form of noise on
measured signals might affect these characteristics. We shall restrict the systems to be
linear and time invariant, and all the signals involved to be stationary random processes.
Starting with basic input-output relationships, we introduce the concept and interpretation
of the ordinary coherence function. This leads on to the main aim of this book, namely
the identification of linear systems based on measurements of input and output.


9.1 SINGLE-INPUT, SINGLE-OUTPUT SYSTEMS 

Consider the input-output relationship depicted in Figure 9.1, which describes a linear
time-invariant system characterized by an impulse response function h(t), with input x(t) and
output y(t).

Input x(t) → [ System h(t) ] → Output y(t)

Figure 9.1 A single-input, single-output system

If the input starts at $t_0$, then the response of the system is

$$y(t) = x(t) * h(t) = \int_{t_0}^{t} h(t - t_1)x(t_1)\,dt_1 \qquad (9.1)$$

If we assume that the system is stable and the response to the stationary random input x(t)
has reached a steady state, i.e. y(t) is also a stationary process for $t_0 \to -\infty$, then Equation
(9.1) can be written as

$$y(t) = \int_{-\infty}^{t} h(t - t_1)x(t_1)\,dt_1 = \int_{0}^{\infty} h(\tau)x(t - \tau)\,d\tau \qquad (9.2)$$


Whilst Equation (9.2) describes fully how input x(t) is related to the corresponding
response y(t), it is more helpful to develop relationships relating the first and second moments
of the input and response. We shall do this in both the time and frequency domains. We shall
include mean values, (auto and cross-) correlation functions and (power and cross-) spectral
density functions.


Mean Values 


If the mean value of input x(t) is $\mu_x$, then the mean value of the output y(t), $\mu_y$, may be
obtained by taking expectations of Equation (9.2), i.e.

$$\mu_y = E[y(t)] = E\left[\int_0^\infty h(\tau)x(t - \tau)\,d\tau\right] \qquad (9.3)$$

The expectation operation is linear and so the right hand side of Equation (9.3) can be
written as

$$\int_0^\infty h(\tau)E[x(t - \tau)]\,d\tau = \int_0^\infty h(\tau)\mu_x\,d\tau = \mu_x\int_0^\infty h(\tau)\,d\tau$$

So it follows that

$$\mu_y = \mu_x\int_0^\infty h(\tau)\,d\tau \qquad (9.4)$$

From this it is clear that if the input has a zero mean value then so does the output,
regardless of the form of $h(\tau)$.


It will be convenient to assume that the signals have zero mean values in what follows.
This keeps the equations from becoming unwieldy.
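
Equation (9.4) has a direct discrete-time counterpart in which the integral of the impulse response becomes a sum over its samples. The following is a minimal sketch of that check; the impulse response `h`, the input mean `mu_x` and the record length are hypothetical values chosen for illustration.

```matlab
% Sketch: discrete-time check of Equation (9.4), mu_y = mu_x * sum(h).
% Assumptions: a hypothetical decaying FIR impulse response h, a Gaussian
% input with mean mu_x, and rng for repeatable random numbers.
rng(0);
mu_x = 3;                               % input mean value
h = 0.9.^(0:49);                        % hypothetical impulse response (50 taps)
N = 1e5;
x = mu_x + randn(1, N);                 % stationary input with mean mu_x
y = filter(h, 1, x);                    % y(n) = sum_k h(k) x(n-k)
y = y(length(h):end);                   % discard the start-up transient
mu_y_est = mean(y);                     % measured output mean
mu_y_theory = mu_x*sum(h);              % discrete form of Equation (9.4)
fprintf('estimated %.3f, predicted %.3f\n', mu_y_est, mu_y_theory);
```

The start-up samples are discarded because the relationship assumes the response has reached a steady state; the two printed values should agree to within the statistical scatter of the sample mean.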
Autocorrelation Functions


If x(t) is characterized by its autocorrelation function $R_{xx}(\tau)$, it is logical to ask how the
output autocorrelation $R_{yy}(\tau)$ is related to $R_{xx}(\tau)$. So we need to evaluate
$R_{yy}(\tau) = E[y(t)y(t + \tau)]$, where y(t) is given by Equation (9.2). This follows, where again we
exploit the linear nature of the expected value operator, which allows it to be taken inside
the integrals:

$$R_{yy}(\tau) = E[y(t)y(t + \tau)] = E\left[\int_0^\infty\int_0^\infty h(\tau_1)x(t - \tau_1)h(\tau_2)x(t + \tau - \tau_2)\,d\tau_1 d\tau_2\right]$$

$$= \int_0^\infty\int_0^\infty h(\tau_1)h(\tau_2)E[x(t - \tau_1)x(t + \tau - \tau_2)]\,d\tau_1 d\tau_2 \qquad (9.5)$$

Thus,

$$R_{yy}(\tau) = \int_0^\infty\int_0^\infty h(\tau_1)h(\tau_2)R_{xx}(\tau + \tau_1 - \tau_2)\,d\tau_1 d\tau_2 \qquad (9.6)$$

This is rather complicated and difficult to evaluate, and we find that the frequency
domain equivalent is more useful.


Taking the Fourier transform of Equation (9.6) gives

$$S_{yy}(f) = \int_{-\infty}^{\infty} R_{yy}(\tau)e^{-j2\pi f\tau}\,d\tau$$

$$= \int_0^\infty h(\tau_1)e^{j2\pi f\tau_1}\,d\tau_1 \int_0^\infty h(\tau_2)e^{-j2\pi f\tau_2}\,d\tau_2 \int_{-\infty}^{\infty} R_{xx}(\tau + \tau_1 - \tau_2)e^{-j2\pi f(\tau + \tau_1 - \tau_2)}\,d\tau \qquad (9.7)$$

Let $\tau + \tau_1 - \tau_2 = u$ in the last integral to yield

$$S_{yy}(f) = |H(f)|^2 S_{xx}(f) \qquad (9.8)$$

where $H(f) = \int_0^\infty h(\tau)e^{-j2\pi f\tau}\,d\tau$ is the system frequency response function. (Recall that
the Fourier transform of the convolution of two functions is the product of their transforms,
i.e. $Y(f) = F\{y(t)\} = F\{h(t) * x(t)\} = H(f)X(f)$, which gives $|Y(f)|^2 = |H(f)|^2|X(f)|^2$,
and compare this with the above equation.)
We see that the frequency domain expression is much simpler than the corresponding
time domain expression. Equation (9.8) describes how the power spectral density of the input
is ‘shaped’ by the frequency response characteristic of the system. The output variance is the
area under the $S_{yy}(f)$ curve, i.e. the output variance receives contributions from across the
frequency range and the influence of the frequency response function is apparent.


Cross-relationships 


The expression in Equation (9.8) is real valued (there is no phase component), and shows
only the magnitude relationship between input and output at frequency $f$. The following
expression may be more useful since it includes the phase characteristic of the system.

Let us start with the input-output cross-correlation function $R_{xy}(\tau) = E[x(t)y(t+\tau)]$. Then


$$\begin{aligned}
R_{xy}(\tau) = E[x(t)y(t+\tau)] &= E\left[\int_0^\infty x(t)\,h(\tau_1)\,x(t+\tau-\tau_1)\,d\tau_1\right] \\
&= \int_0^\infty h(\tau_1)\,E[x(t)x(t+\tau-\tau_1)]\,d\tau_1
\end{aligned} \tag{9.9}$$


i.e.,


$$R_{xy}(\tau) = \int_0^\infty h(\tau_1)\,R_{xx}(\tau-\tau_1)\,d\tau_1 \tag{9.10}$$


Whilst Equation (9.10) is certainly simpler than Equation (9.6), the frequency domain 
equivalent is even simpler. 

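The Fourier transform of Equation (9.10) gives the frequency domain equivalent as


$$S_{xy}(f) = \int_{-\infty}^{\infty} R_{xy}(\tau)\,e^{-j2\pi f\tau}\,d\tau = \int_0^\infty h(\tau_1)\,e^{-j2\pi f\tau_1}\,d\tau_1 \int_{-\infty}^{\infty} R_{xx}(\tau-\tau_1)\,e^{-j2\pi f(\tau-\tau_1)}\,d\tau \tag{9.11}$$


thus


$$S_{xy}(f) = H(f)\,S_{xx}(f) \tag{9.12}$$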


Equation (9.12) contains the phase information of the frequency response function such
that $\arg S_{xy}(f) = \arg H(f)$. Thus, this expression is often used as the basis of system
identification schemes, i.e. by forming the ratio $S_{xy}(f)/S_{xx}(f)$ to give $H(f)$. Note also that if we
restrict ourselves to $f \ge 0$, then we may write the alternative expressions to Equations (9.8)
and (9.12) as


$$G_{yy}(f) = |H(f)|^2\,G_{xx}(f) \tag{9.13}$$

and


$$G_{xy}(f) = H(f)\,G_{xx}(f) \tag{9.14}$$


Examples 

A First-order Continuous System with White Noise Input (M9.1)

Consider the first-order system shown in Figure 9.2 where the system equation is

$$T\dot{y}(t) + y(t) = x(t), \quad T > 0 \tag{9.15}$$


[Block diagram: input $x(t)$ (white noise) → system $T\dot{y}(t) + y(t) = x(t)$ → output $y(t)$]

Figure 9.2 A first-order continuous system driven by white noise

We shall assume that $x(t)$ has zero mean value and is ‘white’, i.e. has a delta function
autocorrelation which we write $R_{xx}(\tau) = \sigma^2\delta(\tau)$. The impulse response function of the system
is

$$h(t) = \frac{1}{T}e^{-t/T}, \quad t \ge 0$$


the transfer function is


$$H(s) = \frac{1}{1 + Ts}$$


and the frequency response function is

$$H(f) = \frac{1}{1 + j2\pi fT}$$

Using Equation (9.6), the autocorrelation function of the output is


$$\begin{aligned}
R_{yy}(\tau) &= \int_0^\infty\!\!\int_0^\infty h(\tau_1)h(\tau_2)\,\sigma^2\delta(\tau+\tau_1-\tau_2)\,d\tau_1\,d\tau_2 \\
&= \sigma^2\int_0^\infty h(\tau_1)\int_0^\infty h(\tau_2)\,\delta(\tau+\tau_1-\tau_2)\,d\tau_2\,d\tau_1 \\
&= \sigma^2\int_0^\infty h(\tau_1)h(\tau+\tau_1)\,d\tau_1 = \sigma^2 R_{hh}(\tau)
\end{aligned}$$


(9.16) 


(9.17) 


This shows that for a white noise input, the output autocorrelation function is a scaled version 
of the autocorrelation function formed from the system impulse response function. 
From Equation (9.8), the power spectral density function of the output is


$$S_{yy}(f) = |H(f)|^2\,S_{xx}(f) = \frac{\sigma^2}{1 + (2\pi fT)^2} \tag{9.18}$$


Note that


$$R_{yy}(0) = \sigma_y^2 = \sigma^2\int_0^\infty h^2(\tau_1)\,d\tau_1 = \sigma^2\int_{-\infty}^{\infty}|H(f)|^2\,df = \int_{-\infty}^{\infty}S_{yy}(f)\,df$$

i.e. the output variance is shaped by the frequency response characteristic of the system and
is spread across the frequencies as shown in Figure 9.3. A filter operating on white noise in
this way is often called a ‘shaping filter’.



Figure 9.3 Power spectral density functions of input and output for the system in Figure 9.2. 
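
The equality of the time domain and frequency domain expressions for the output variance can be checked numerically. The sketch below is only an illustration: the values of `T` and `sigma2`, the truncation limits and the helper `trapezoid` are assumptions, and the closed form $\sigma^2/(2T)$ is obtained here by integrating $h^2(t) = e^{-2t/T}/T^2$ directly.

```python
import math


def trapezoid(func, a, b, n):
    """Composite trapezoidal rule for func on [a, b] using n sub-intervals."""
    step = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * step)
    return total * step


T = 0.5        # assumed time constant of the first-order system
sigma2 = 3.0   # assumed level of the white noise, R_xx(tau) = sigma2 * delta(tau)


def h(t):
    """Impulse response h(t) = (1/T) exp(-t/T) for t >= 0."""
    return math.exp(-t / T) / T


def H_mag2(f):
    """Squared magnitude of H(f) = 1 / (1 + j 2 pi f T)."""
    return 1.0 / (1.0 + (2.0 * math.pi * f * T) ** 2)


# Time domain: sigma^2 * integral of h^2 (truncated at 40 time constants).
var_time = sigma2 * trapezoid(lambda t: h(t) ** 2, 0.0, 40.0 * T, 100_000)

# Frequency domain: area under S_yy(f) (truncated at |f| <= 2000 Hz).
var_freq = sigma2 * trapezoid(H_mag2, -2000.0, 2000.0, 200_000)

var_closed = sigma2 / (2.0 * T)
print(var_time, var_freq, var_closed)
```

The frequency domain value falls slightly short of the other two because the slowly decaying tails of $|H(f)|^2$ are cut off at a finite frequency.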


Cross-relationships 

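Consider the cross-spectral density and the cross-correlation functions which can be written
as


$$S_{xy}(f) = H(f)\,S_{xx}(f) = \frac{\sigma^2}{1 + j2\pi fT} \tag{9.19}$$


$$R_{xy}(\tau) = \int_0^\infty h(\tau_1)\,\sigma^2\delta(\tau-\tau_1)\,d\tau_1 = \sigma^2 h(\tau) =
\begin{cases} \dfrac{\sigma^2}{T}e^{-\tau/T} & \tau > 0 \\[4pt] 0 & \tau < 0 \end{cases} \tag{9.20}$$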


From these two equations, it is seen that, if the input is white noise, the cross-spectral 
density function is just a scaled version of the system frequency response function, and 
the cross-correlation function is the impulse response function scaled by the variance of 
the input white noise (see also the comments in MATLAB Example 8.10 in Chapter 8). 
This result applies generally (directly from Equations (9.10) and (9.12)). Accordingly 
white noise seems the ideal random excitation for system identification. These results are 
theoretical and in practice band-limiting limits the accuracy of any identification. 
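
The idea of identifying an impulse response from the input-output cross-correlation can be sketched in discrete time, where white noise has $R_{xx}[k] = \sigma^2\delta[k]$ and so $R_{xy}[k] = \sigma^2 h[k]$. The FIR coefficients, the record length and the function name `cross_correlation` below are illustrative assumptions, not part of the text.

```python
import random

random.seed(1)
sigma = 1.5                     # assumed standard deviation of the white input
h = [1.0, 0.6, 0.3, 0.1]        # hypothetical impulse response to be recovered
N = 40_000

x = [random.gauss(0.0, sigma) for _ in range(N)]
y = [
    sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
    for n in range(N)
]


def cross_correlation(x, y, lag):
    """Sample estimate of R_xy[lag] = E[x[n] * y[n + lag]] for lag >= 0."""
    count = len(x) - lag
    return sum(x[n] * y[n + lag] for n in range(count)) / count


# Dividing by the input variance recovers the impulse response samples.
h_est = [cross_correlation(x, y, k) / sigma ** 2 for k in range(len(h))]
print(h_est)
```

With a finite record the estimates scatter around the true coefficients, which is one practical limit on this kind of identification.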
A First-order Continuous System with a Sinusoidal Input (M9.2)

Consider the same system as in the previous example. If the input is a pure sine function, e.g.
$x(t) = A\sin(2\pi f_0 t + \theta)$ with the autocorrelation function $R_{xx}(\tau) = (A^2/2)\cos(2\pi f_0\tau)$ (see
Section 8.6), the power spectral density function $S_{xx}(f) = (A^2/4)[\delta(f - f_0) + \delta(f + f_0)]$
and the variance $\sigma_x^2 = A^2/2$, then the power spectral density function of the output is


$$S_{yy}(f) = |H(f)|^2\,S_{xx}(f) = \frac{1}{1 + (2\pi fT)^2}\,\frac{A^2}{4}\left[\delta(f - f_0) + \delta(f + f_0)\right] \tag{9.21}$$

The variance and autocorrelation function of the output are

$$\sigma_y^2 = \int_{-\infty}^{\infty} S_{yy}(f)\,df = \frac{A^2}{2}\,\frac{1}{1 + (2\pi f_0 T)^2} \tag{9.22}$$

$$R_{yy}(\tau) = F^{-1}\{S_{yy}(f)\} = \frac{A^2}{2}\,\frac{1}{1 + (2\pi f_0 T)^2}\cos(2\pi f_0\tau) \tag{9.23}$$

The cross-spectral density and the cross-correlation functions are

$$S_{xy}(f) = H(f)\,S_{xx}(f) = \frac{1}{1 + j2\pi fT}\,\frac{A^2}{4}\left[\delta(f - f_0) + \delta(f + f_0)\right] \tag{9.24}$$

$$R_{xy}(\tau) = F^{-1}\{S_{xy}(f)\} = \frac{A^2}{2\sqrt{1 + (2\pi f_0 T)^2}}\sin(2\pi f_0\tau + \phi) \tag{9.25}$$


where

$$\phi = \tan^{-1}\left(\frac{1}{2\pi f_0 T}\right)$$

Thus, it can be seen that the response to a sinusoidal input is sinusoidal with the same frequency,
the variance differs (Equation (9.23) with $\tau = 0$) and the cross-correlation (9.25) shows the
phase shift.


A Second-order Vibrating System 

Consider the single-degree-of-freedom system shown in Figure 9.4 where the equation
of motion is


$$m\ddot{y}(t) + c\dot{y}(t) + ky(t) = x(t) \tag{9.26}$$


[Diagram: mass $m$ supported by a spring $k$ and a damper $c$; force $x(t)$, displacement $y(t)$]

Figure 9.4 A single-degree-of-freedom vibration system
The impulse response function of the system is

$$h(t) = \frac{1}{m\omega_d}e^{-\zeta\omega_n t}\sin\omega_d t, \quad t \ge 0$$

and the frequency response function is


$$H(f) = \frac{1}{k - m(2\pi f)^2 + jc(2\pi f)}$$


where $\omega_n = \sqrt{k/m}$, $\zeta = c/2m\omega_n$ and $\omega_d = \omega_n\sqrt{1 - \zeta^2}$.

If the input is white noise with the autocorrelation function $R_{xx}(\tau) = \sigma^2\delta(\tau)$, then
the power spectral density and the cross-spectral density functions of the output are


$$S_{yy}(f) = |H(f)|^2\,S_{xx}(f) = \frac{\sigma^2}{[k - m(2\pi f)^2]^2 + [c(2\pi f)]^2} \tag{9.27}$$


and


$$S_{xy}(f) = H(f)\,S_{xx}(f) = \frac{\sigma^2}{k - m(2\pi f)^2 + jc(2\pi f)} \tag{9.28}$$

The cross-correlation function is

$$R_{xy}(\tau) = \sigma^2 h(\tau) =
\begin{cases} \dfrac{\sigma^2}{m\omega_d}e^{-\zeta\omega_n\tau}\sin\omega_d\tau & \tau \ge 0 \\[4pt] 0 & \tau < 0 \end{cases} \tag{9.29}$$


which is a scaled version of the impulse response function (see the comments in MATLAB 
Example 8.10). 


This and the other examples given in this section indicate how the correlation functions 
and the spectral density functions may be used for system identification of single-input, single- 
output systems (see also some other considerations given in Appendix C). Details of system 
identification methods are discussed in Section 9.3. 


9.2 THE ORDINARY COHERENCE FUNCTION 


As a measure of the degree of linear association between two signals (e.g. input and 
output signals), the ordinary coherence function is widely used. The ordinary coherence 
function (or simply the coherence function) between two signals x(t) and y(t) is defined as 


$$\gamma_{xy}^2(f) = \frac{|G_{xy}(f)|^2}{G_{xx}(f)\,G_{yy}(f)} = \frac{|S_{xy}(f)|^2}{S_{xx}(f)\,S_{yy}(f)} \tag{9.30}$$


From the inequality property of the spectral density functions given in Chapter 8, i.e.
$|S_{xy}(f)|^2 \le S_{xx}(f)\,S_{yy}(f)$, it follows from Equation (9.30) that


$$0 \le \gamma_{xy}^2(f) \le 1 \tag{9.31}$$
If $x(t)$ and $y(t)$ are input and output signals, then $S_{yy}(f) = |H(f)|^2 S_{xx}(f)$ and
$S_{xy}(f) = H(f)S_{xx}(f)$. So the coherence function for the single-input, single-output
system given in Figure 9.1 is shown to be


$$\gamma_{xy}^2(f) = \frac{|H(f)|^2\,S_{xx}^2(f)}{S_{xx}(f)\,|H(f)|^2\,S_{xx}(f)} = 1 \tag{9.32}$$


Thus, it is shown that the coherence function is unity if $x(t)$ and $y(t)$ are linearly related.
Conversely, if $S_{xy}(f)$ is zero, i.e. the two signals are uncorrelated, then the coherence
function is zero. If the coherence function is greater than zero but less than one, then
$x(t)$ and $y(t)$ are partially linearly related. Possible departures from linear relationship
between $x(t)$ and $y(t)$ include:


1. Noise may be present in the measurements of either or both $x(t)$ and $y(t)$.

2. $x(t)$ and $y(t)$ are not only linearly related (e.g. they may also be related nonlinearly).

3. $y(t)$ is an output due not only to input $x(t)$ but also to other inputs.

Since $\gamma_{xy}^2(f)$ is a function of frequency, its appearance across the frequency range can be
very revealing. In some ranges it may be close to unity and in others not, e.g. see Figure
9.5, indicating frequency ranges where ‘linearity’ may be more or less evident.



Figure 9.5 A typical example of the coherence function 


Effect of Measurement Noise 
Case (a) Output Noise 

Consider the effect of measurement noise on the output as shown in Figure 9.6, where $y_m(t)$ is
a measured signal such that $y_m(t) = y(t) + n_y(t)$. We assume that input $x(t)$ and measurement
noise $n_y(t)$ are uncorrelated. Since $y(t)$ is linearly related to $x(t)$, then $y(t)$ and $n_y(t)$ are also
uncorrelated. Then, the coherence function between $x(t)$ and $y_m(t)$ is


$$\gamma_{xy_m}^2(f) = \frac{|S_{xy_m}(f)|^2}{S_{xx}(f)\,S_{y_m y_m}(f)} \tag{9.33}$$


Figure 9.6 The effect of measurement noise on the output
where $S_{y_m y_m}(f) = S_{yy}(f) + S_{n_y n_y}(f) = |H(f)|^2 S_{xx}(f) + S_{n_y n_y}(f)$. Using the standard
input-output relationship, i.e.


$$S_{xy_m}(f) = S_{xy}(f) + S_{xn_y}(f) = S_{xy}(f) = H(f)\,S_{xx}(f) \tag{9.34}$$


the coherence function becomes

$$\gamma_{xy_m}^2(f) = \frac{|H(f)|^2\,S_{xx}^2(f)}{S_{xx}(f)\left[|H(f)|^2 S_{xx}(f) + S_{n_y n_y}(f)\right]} = \frac{1}{1 + \dfrac{S_{n_y n_y}(f)}{|H(f)|^2 S_{xx}(f)}} \tag{9.35}$$


So


$$\gamma_{xy_m}^2(f) = \frac{1}{1 + \dfrac{S_{n_y n_y}(f)}{S_{yy}(f)}} = \frac{S_{yy}(f)}{S_{yy}(f) + S_{n_y n_y}(f)} = \frac{S_{yy}(f)}{S_{y_m y_m}(f)} \tag{9.36}$$


From Equation (9.36), it can be seen that the coherence function $\gamma_{xy_m}^2(f)$ describes how much
of the output power of the measured signal $y_m(t)$ is contributed (linearly) by input $x(t)$. Also,
since the noise portion is

$$\frac{S_{n_y n_y}(f)}{S_{y_m y_m}(f)} = \frac{S_{y_m y_m}(f) - S_{yy}(f)}{S_{y_m y_m}(f)} = 1 - \gamma_{xy_m}^2(f)$$


the quantity $1 - \gamma_{xy_m}^2(f)$ is the fractional portion of the output power that is not due to input $x(t)$.


Thus, a useful concept, called the coherent output power (spectral density function),
is defined as

$$S_{yy}(f) = \gamma_{xy_m}^2(f)\,S_{y_m y_m}(f) \tag{9.37}$$

which describes the part of the output power fully coherent with the input. In words, the
power spectrum of the output that is due to the source is the product of the coherence
function between the source and the measured output, and the power spectrum of the
measured output. Similarly, the noise power (or uncoherent output power) is the part of
the output power not coherent with the input, and is

$$S_{n_y n_y}(f) = \left[1 - \gamma_{xy_m}^2(f)\right] S_{y_m y_m}(f) \tag{9.38}$$


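The ratio $S_{yy}(f)/S_{n_y n_y}(f)$ is the signal-to-noise ratio at the output at frequency $f$. If this
is large then $\gamma_{xy_m}^2(f) \to 1$, and if it is small then $\gamma_{xy_m}^2(f) \to 0$, i.e.


$$\gamma_{xy_m}^2(f) \to 1 \ \text{as}\ \frac{S_{yy}(f)}{S_{n_y n_y}(f)} \to \infty \quad\text{and}\quad \gamma_{xy_m}^2(f) \to 0 \ \text{as}\ \frac{S_{yy}(f)}{S_{n_y n_y}(f)} \to 0 \tag{9.39}$$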
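
Equations (9.36) to (9.38) can be exercised with a few numbers. The sketch below works at a single frequency and takes the true output power and the noise power as given; the function names and the numerical values are assumptions for illustration.

```python
def coherence_output_noise(S_yy, S_nn):
    """Coherence between x and the noisy output y_m, Equation (9.36)."""
    return 1.0 / (1.0 + S_nn / S_yy)


def split_output_power(coherence, S_ymym):
    """Return (coherent output power, noise power), Equations (9.37) and (9.38)."""
    coherent = coherence * S_ymym
    noise = (1.0 - coherence) * S_ymym
    return coherent, noise


S_yy, S_nn = 9.0, 1.0          # assumed signal and noise power at one frequency
gamma2 = coherence_output_noise(S_yy, S_nn)
coherent, noise = split_output_power(gamma2, S_yy + S_nn)
print(gamma2, coherent, noise)
```

With an output signal-to-noise ratio of 9 the coherence is 0.9, and the measured output power of 10 splits back into the 9 units due to the input and 1 unit of noise.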


Case (b) Input Noise 

Now consider the effect of measurement noise (assumed uncorrelated with $x(t)$) on the input
as shown in Figure 9.7, where $x_m(t)$ is a measured signal such that $x_m(t) = x(t) + n_x(t)$ and
$S_{x_m x_m}(f) = S_{xx}(f) + S_{n_x n_x}(f)$.

Figure 9.7 The effect of measurement noise on the input

Then, similar to the previous case, the coherence function between $x_m(t)$ and $y(t)$ is


$$\gamma_{x_m y}^2(f) = \frac{1}{1 + \dfrac{S_{n_x n_x}(f)}{S_{xx}(f)}} = \frac{S_{xx}(f)}{S_{xx}(f) + S_{n_x n_x}(f)} = \frac{S_{xx}(f)}{S_{x_m x_m}(f)} \tag{9.40}$$

Thus, the input power and noise power can be decomposed as


$$S_{xx}(f) = \gamma_{x_m y}^2(f)\,S_{x_m x_m}(f) \tag{9.41}$$

and

$$S_{n_x n_x}(f) = \left[1 - \gamma_{x_m y}^2(f)\right] S_{x_m x_m}(f) \tag{9.42}$$

Case (c) Input and Output Noise 

Consider the uncorrelated noise at both input and output as shown in Figure 9.8, where x m (t) 
and y m (t) are the measured input and output. 


Figure 9.8 The effect of measurement noise on both input and output


The noises are assumed mutually uncorrelated and uncorrelated with $x(t)$. Then the coherence
function between $x_m(t)$ and $y_m(t)$ becomes


$$\gamma_{x_m y_m}^2(f) = \frac{|S_{x_m y_m}(f)|^2}{S_{x_m x_m}(f)\,S_{y_m y_m}(f)} = \frac{1}{1 + \dfrac{S_{n_x n_x}(f)}{S_{xx}(f)} + \dfrac{S_{n_y n_y}(f)}{S_{yy}(f)} + \dfrac{S_{n_x n_x}(f)\,S_{n_y n_y}(f)}{S_{xx}(f)\,S_{yy}(f)}} \tag{9.43}$$


Note that, in this case, it is not possible to obtain the signal powers $S_{xx}(f)$ and $S_{yy}(f)$
using the measured signals $x_m(t)$ and $y_m(t)$ without knowledge or measurement of the noise.
Some comments on the ordinary coherence function are given in Appendix D. 
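
A small numerical sketch of Equation (9.43) follows. It takes the input and output signal-to-noise ratios at one frequency as arguments; the function name and the values are assumptions. Note that the denominator of (9.43) factorizes as $(1 + S_{n_x n_x}/S_{xx})(1 + S_{n_y n_y}/S_{yy})$, so the result equals the product of the coherence values of Cases (a) and (b).

```python
def coherence_both_noisy(snr_in, snr_out):
    """Coherence between x_m and y_m from Equation (9.43).

    snr_in  = S_xx / S_nxnx  (input signal-to-noise ratio at one frequency)
    snr_out = S_yy / S_nyny  (output signal-to-noise ratio at one frequency)
    """
    a = 1.0 / snr_in
    b = 1.0 / snr_out
    return 1.0 / (1.0 + a + b + a * b)


gamma2_both = coherence_both_noisy(10.0, 10.0)
gamma2_input_only = coherence_both_noisy(10.0, float("inf"))
gamma2_output_only = coherence_both_noisy(float("inf"), 10.0)
print(gamma2_both, gamma2_input_only * gamma2_output_only)
```

Passing an infinite signal-to-noise ratio on one side recovers Equation (9.36) or (9.40), which is a convenient sanity check on the formula.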


9.3 SYSTEM IDENTIFICATION (M9.4, 9.5)

The objective of this section is to show how we can estimate the frequency response
function of a linear time-invariant system when the input is a stationary random process.
It is assumed that both input and output are measurable, but may be noise contaminated.
In fact, as we have seen earlier, the frequency response function $H(f)$ can be obtained by
forming the ratio $S_{xy}(f)/S_{xx}(f)$ (see Equation (9.12)). However, if noise is present on
the measured signals $x_m(t)$ and $y_m(t)$ as shown in Figure 9.8, then we may only ‘estimate’
the frequency response function, e.g. by forming $S_{x_m y_m}(f)/S_{x_m x_m}(f)$. In fact we shall now
see that this is only one particular approach to estimating the frequency response from
among others.

Figure 9.8 depicts the problem we wish to address. On the basis of making
measurements $x_m(t)$ and $y_m(t)$, which are noisy versions of input $x(t)$ and response $y(t)$, we
wish to identify the linear system linking $x$ and $y$.

To address this problem we begin by resorting to something very much simpler.
Forget for the moment the time histories involved and consider the problem of trying to
link two random variables $X$ and $Y$ when measures of this bivariate process are available
as pairs $(x_i, y_i)$, $i = 1, 2, \ldots, N$. Suppose we wish to find a linear relationship between
$x$ and $y$ of the form $y = ax$. We may plot the data as a scatter diagram as shown in
Figure 9.9. The parameter $a$ might be found by adjusting the line to ‘best-fit’ the scatter
of points. In this context the points $(x_i, y_i)$ could come from any two variables, but to
maintain contact with Figure 9.8 it is convenient to think of $x$ as an input and $y$ as the
output. With reference to Figure 9.9, the slope $a$ is the ‘gain’ relating $x$ to $y$.

To find the slope a that is best means deciding on some objective measure of closeness 
of fit and selecting the value of a that achieves the ‘optimal’ closeness. So we need some 
measure of the ‘error’ between the line and the data points. 



Figure 9.9 Scatter diagram relating variable x (input) and y (output) 

We choose to depict three errors that characterize the ‘distance’ of a data point from
the line. These are shown in Figure 9.10:

• Case 1: The distance (error) is measured in the $y$ direction and denoted $e_y$. This assumes
that offsets in the $x$ direction are not important. Since $x$ is identified with input and
$y$ with output, the implication of this is that it is errors on the output that are more
important than errors on the input. This is analogous to the system described in Case
(a), Section 9.2 (see Figure 9.6).

• Case 2: In this case we reverse the situation and accentuate the importance of offsets
(errors) in the $x$ direction, $e_x$, i.e. errors on input are more important than on output.
This is analogous to the system described in Case (b), Section 9.2 (see Figure 9.7).

• Case 3: Now we recognize that errors in both $x$ and $y$ directions matter and choose an
offset (error) measure normal to the line, $e_T$. The subscript $T$ denotes ‘total’. This is
analogous to the system described in Case (c), Section 9.2 (see Figure 9.8).
Figure 9.10 Scatter diagram; three types of error, $e_y$, $e_x$ and $e_T$

For each of the cases we need to create an objective function or cost function
from these measures and find the slope $a$ that minimizes the cost function. There is
an unlimited set of such functions and we shall choose the simplest, namely the sum of
squared errors. This results in three least squares optimisation problems as follows.


Case 1: Errors in the Output, $e_y$ (i.e. $x_i$ are known exactly but $y_i$ are noisy)

In this case, we will find the parameter $a_1$ that fits the data such that $y = a_1 x$ and minimizes
the sum of squares of errors $\sum_{i=1}^{N}(e_y^i)^2$, where the error is defined as $e_y^i = y_i - a_1 x_i$. We
form an objective function (or a cost function) as


$$J_1 = \frac{1}{N}\sum_{i=1}^{N}(e_y^i)^2 = \frac{1}{N}\sum_{i=1}^{N}(y_i - a_1 x_i)^2 \tag{9.44}$$

and minimize $J_1$ with respect to $a_1$. $J_1$ is a quadratic function of $a_1$ and has a single
minimum located at the solution of $\partial J_1/\partial a_1 = 0$, i.e.


$$\frac{\partial J_1}{\partial a_1} = \frac{2}{N}\sum_{i=1}^{N}(y_i - a_1 x_i)(-x_i) = 0 \tag{9.45}$$


Thus, the parameter $a_1$ is found by


$$a_1 = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2} \tag{9.46}$$


Note that, if the divisor $N$ is used in the numerator and denominator of Equation (9.46),
the numerator is the cross-correlation of two variables $x$ and $y$, and the denominator is
the variance of $x$ (assuming zero mean value). If $N$ is large then it is logical to write a
limiting form for $a_1$ as


$$a_1 = \frac{E[xy]}{E[x^2]} = \frac{\sigma_{xy}}{\sigma_x^2} \quad \text{(for zero mean)} \tag{9.47}$$

emphasizing the ratio of the cross-correlation to the input power for this estimator.

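Case 2: Errors in the Input, $e_x$ (i.e. $y_i$ are known exactly but $x_i$ are noisy)

Now we find the parameter $a_2$ for $y = a_2 x$. The error is defined as $e_x^i = x_i - y_i/a_2$, and
we form an objective function as


$$J_2 = \frac{1}{N}\sum_{i=1}^{N}(e_x^i)^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \frac{y_i}{a_2}\right)^2 \tag{9.48}$$


Then, minimizing $J_2$ with respect to $a_2$, i.e.


$$\frac{\partial J_2}{\partial a_2} = \frac{2}{N}\sum_{i=1}^{N}\left(x_i - \frac{y_i}{a_2}\right)\left(\frac{y_i}{a_2^2}\right) = 0 \tag{9.49}$$


gives the value of parameter $a_2$ as


$$a_2 = \frac{\sum_{i=1}^{N} y_i^2}{\sum_{i=1}^{N} x_i y_i} \tag{9.50}$$


Note that, in contrast to Equation (9.46), the numerator of Equation (9.50) represents the
power of the output and the denominator represents the cross-correlation between $x$ and
$y$. Again, taking a limiting case we express this as


$$a_2 = \frac{E[y^2]}{E[xy]} = \frac{\sigma_y^2}{\sigma_{xy}} \quad \text{(for zero mean)} \tag{9.51}$$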


Case 3: Errors in Both Input and Output, $e_T$ (i.e. both variables $x_i$ and $y_i$ are noisy)


In this case, the error to be minimized is defined as the perpendicular distance to the line
$y = a_T x$ as shown in Figure 9.11. This approach is called the total least squares (TLS)
scheme, and from the figure the error can be written as


$$e_T^i = \frac{y_i - a_T x_i}{\sqrt{1 + a_T^2}} \tag{9.52}$$


Figure 9.11 Representation of error $e_T$ normal to the line $y = a_T x$

Then the cost function $J_T$ is

$$J_T = \frac{1}{N}\sum_{i=1}^{N}(e_T^i)^2 = \frac{1}{N}\sum_{i=1}^{N}\frac{(y_i - a_T x_i)^2}{1 + a_T^2} \tag{9.53}$$
This is a non-quadratic function of $a_T$ and so there may be more than one extreme value.
The necessary conditions for extrema are obtained from


$$\frac{\partial J_T}{\partial a_T} = \frac{1}{N}\sum_{i=1}^{N}\frac{2(y_i - a_T x_i)(-x_i)}{1 + a_T^2} - \frac{1}{N}\sum_{i=1}^{N}\frac{(y_i - a_T x_i)^2\,(2a_T)}{(1 + a_T^2)^2} = 0 \tag{9.54}$$

This yields a quadratic equation in $a_T$ as

$$a_T^2\sum_{i=1}^{N} x_i y_i + a_T\left(\sum_{i=1}^{N} x_i^2 - \sum_{i=1}^{N} y_i^2\right) - \sum_{i=1}^{N} x_i y_i = 0 \tag{9.55}$$


Thus, we note that the non-quadratic form of the cost function $J_T$ (9.53) results in two
possible solutions for $a_T$, i.e. we need to find the correct $a_T$ that minimizes the cost
function. We resolve this as follows. If we consider $N$ large then the cost function $J_T$ can
be rewritten as


$$J_T = E\left[(e_T^i)^2\right] = E\left[\frac{(y_i - a_T x_i)^2}{1 + a_T^2}\right] = \frac{\sigma_y^2 + a_T^2\sigma_x^2 - 2a_T\sigma_{xy}}{1 + a_T^2} \quad \text{(for zero mean)} \tag{9.56}$$


Then Equation (9.55) becomes

$$a_T^2\sigma_{xy} + a_T\left(\sigma_x^2 - \sigma_y^2\right) - \sigma_{xy} = 0 \tag{9.57}$$

The solutions of this are given by

$$a_T = \frac{\left(\sigma_y^2 - \sigma_x^2\right) \pm \sqrt{\left(\sigma_x^2 - \sigma_y^2\right)^2 + 4\sigma_{xy}^2}}{2\sigma_{xy}} \tag{9.58}$$


A typical form of the theoretical cost function (Equation (9.56)) may be drawn as in
Figure 9.12 (Tan, 2005), where the two marked extrema correspond to the solutions (9.58) with
positive and negative square root respectively.


Figure 9.12 Theoretical cost function $J_T$ versus $a_T$


From this we see that the correct parameter $a_T$ (in a limiting form) for $y = a_T x$
that minimizes the cost function $J_T$ is given by the solution (9.58) which has the positive
square root, i.e.


$$a_T = \frac{\sigma_y^2 - \sigma_x^2 + \sqrt{\left(\sigma_x^2 - \sigma_y^2\right)^2 + 4\sigma_{xy}^2}}{2\sigma_{xy}} \tag{9.59}$$
Accordingly, the correct parameter $a_T$ for Equation (9.55) is


$$a_T = \frac{\sum_{i=1}^{N} y_i^2 - \sum_{i=1}^{N} x_i^2 + \sqrt{\left(\sum_{i=1}^{N} x_i^2 - \sum_{i=1}^{N} y_i^2\right)^2 + 4\left(\sum_{i=1}^{N} x_i y_i\right)^2}}{2\sum_{i=1}^{N} x_i y_i} \tag{9.60}$$
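
The three slope estimates of Equations (9.46), (9.50) and (9.60) are easy to compare on a small data set. The sketch below is illustrative only: the function name `slope_estimates` and the data values are assumptions, and the data are taken to have zero mean offsets so that the line passes through the origin.

```python
import math


def slope_estimates(x, y):
    """Return (a1, a2, aT) from Equations (9.46), (9.50) and (9.60)."""
    sxx = sum(xi * xi for xi in x)
    syy = sum(yi * yi for yi in y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    a1 = sxy / sxx                      # errors assumed on the output only
    a2 = syy / sxy                      # errors assumed on the input only
    aT = (syy - sxx + math.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return a1, a2, aT


x = [1.0, 2.0, 3.0, 4.0]               # hypothetical input samples
y = [1.1, 1.9, 3.2, 3.8]               # hypothetical noisy output samples
a1, a2, aT = slope_estimates(x, y)
print(a1, aT, a2)
```

For these data the total least squares slope lies between the other two, and for noise-free data such as `y = [2.0 * xi for xi in x]` all three estimates coincide.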


Frequency Response Identification 


The relationship between $x(t)$ and $y(t)$ in the time domain is convolution (not a simple gain) -
but it becomes a gain through the Fourier transform, i.e. $Y(f) = H(f)X(f)$, and the previous
gain $a$ is now the complex-valued $H(f)$ which is frequency dependent.

With reference to Figure 9.8, i.e. considering measurement noise, suppose we have a
series of measured results $X_{mi}(f)$ and $Y_{mi}(f)$. The index $i$ previously introduced now implies
each sample realization that corresponds to each sample time history of length $T$. Accordingly,
the form


$$\frac{1}{N}\sum_{i=1}^{N} x_i^2$$


used in the previous analysis can be replaced by


$$\frac{1}{N}\sum_{i=1}^{N}\frac{|X_{mi}(f)|^2}{T}$$


As we shall see in Chapter 10, this is an estimator for the power spectral density function of
$x_m(t)$, i.e.


$$\tilde{S}_{x_m x_m}(f) = \frac{1}{N}\sum_{i=1}^{N}\frac{|X_{mi}(f)|^2}{T} \tag{9.61}$$


Similarly, the cross-spectral density between $x_m(t)$ and $y_m(t)$ can be estimated by


$$\tilde{S}_{x_m y_m}(f) = \frac{1}{N}\sum_{i=1}^{N}\frac{X_{mi}^*(f)\,Y_{mi}(f)}{T} \tag{9.62}$$


These results introduce three frequency response function estimators based on $a_1$, $a_2$ and $a_T$.
A logical extension of the results to complex form yields the following:


1. Estimator $H_1(f)$: Based on Equation (9.46),

$$a_1 = \sum_{i=1}^{N} x_i y_i \Big/ \sum_{i=1}^{N} x_i^2$$

the estimator $H_1(f)$ is defined as

$$H_1(f) = \frac{\tilde{S}_{x_m y_m}(f)}{\tilde{S}_{x_m x_m}(f)} \tag{9.63}$$

We shall see later that this estimator is unbiased with respect to the presence of output
noise, i.e. ‘best’ for errors (noise) on the output. Once again, the limiting (theoretical)
version of this is

$$H_1(f) = \frac{S_{x_m y_m}(f)}{S_{x_m x_m}(f)} \tag{9.64}$$

This estimator is probably the most widely used.

2. Estimator $H_2(f)$: Based on Equation (9.50),

$$a_2 = \sum_{i=1}^{N} y_i^2 \Big/ \sum_{i=1}^{N} x_i y_i$$

the estimator $H_2(f)$ is defined as

$$H_2(f) = \frac{\tilde{S}_{y_m y_m}(f)}{\tilde{S}_{y_m x_m}(f)} \tag{9.65}$$

This estimator is known to be ‘best’ for errors (noise) on the input, i.e. it is unbiased
with respect to the presence of input noise. Note that the denominator is $\tilde{S}_{y_m x_m}(f)$ (not
$\tilde{S}_{x_m y_m}(f)$). This is due to the location of the conjugate in the numerator of Equation (9.62),
so $\tilde{S}_{y_m x_m}(f)$ must be used to satisfy the form $H(f) = Y(f)/X(f)$ (see Appendix E for
a complex-valued least squares problem). Similar to the $H_1(f)$ estimator, the theoretical
form of Equation (9.65) is

$$H_2(f) = \frac{S_{y_m y_m}(f)}{S_{y_m x_m}(f)} \tag{9.66}$$

3. Estimator $H_W(f)$ (also known as $H_s(f)$ or $H_v(f)$): This estimator is sometimes called
the total least squares estimator. It has various derivations with slightly different forms -
sometimes it is referred to as the $H_v(f)$ estimator (Leuridan et al., 1986; Allemang and
Brown, 2002) and as the $H_s(f)$ estimator (Wicks and Vold, 1986). Recently, White et al.
(2006) generalized this estimator as a maximum likelihood (ML) estimator. We denote
this ML estimator $H_W(f)$, which is


$$H_W(f) = \frac{\tilde{S}_{y_m y_m}(f) - \kappa(f)\tilde{S}_{x_m x_m}(f) + \sqrt{\left[\tilde{S}_{x_m x_m}(f)\kappa(f) - \tilde{S}_{y_m y_m}(f)\right]^2 + 4\left|\tilde{S}_{x_m y_m}(f)\right|^2\kappa(f)}}{2\tilde{S}_{y_m x_m}(f)} \tag{9.67}$$


where $\kappa(f)$ is the ratio of the spectra of the measurement noises, i.e. $\kappa(f) = S_{n_y n_y}(f)/S_{n_x n_x}(f)$.

This estimator is ‘best’ for errors (noise) on both input and output, i.e. it is unbiased
with respect to the presence of both input and output noise provided that the ratio of
noise spectra is known. Note that if $\kappa(f) = 0$ then $H_W(f) = H_2(f)$, and if $\kappa(f) \to \infty$
then $H_W(f) \to H_1(f)$ (see Appendix F for the proof). In practice, it may be difficult
to know $\kappa(f)$. In this case, $\kappa(f) = 1$ may be a logical assumption, i.e. the noise power
in the input signal is the same as in the output signal. If so, the estimator $H_W(f)$ becomes
the solution of the TLS method which is often referred to as the $H_v(f)$ estimator (we
use the notation $H_T(f)$ where the subscript $T$ denotes ‘total’), i.e.

$$H_T(f) = \frac{\tilde{S}_{y_m y_m}(f) - \tilde{S}_{x_m x_m}(f) + \sqrt{\left[\tilde{S}_{x_m x_m}(f) - \tilde{S}_{y_m y_m}(f)\right]^2 + 4\left|\tilde{S}_{x_m y_m}(f)\right|^2}}{2\tilde{S}_{y_m x_m}(f)} \tag{9.68}$$

Note that this is analogous to $a_T$ defined in Equation (9.60). The theoretical form of this
is

$$H_T(f) = \frac{S_{y_m y_m}(f) - S_{x_m x_m}(f) + \sqrt{\left[S_{x_m x_m}(f) - S_{y_m y_m}(f)\right]^2 + 4\left|S_{x_m y_m}(f)\right|^2}}{2S_{y_m x_m}(f)} \tag{9.69}$$


The Biasing Effect of Noise on the Frequency Response Function Estimators
$H_1(f)$ and $H_2(f)$


First, consider the effect of output noise only as described in Figure 9.6. The $H_1(f)$ estimator
is


$$H_1(f) = \frac{S_{xy_m}(f)}{S_{xx}(f)} = \frac{S_{xy}(f) + S_{xn_y}(f)}{S_{xx}(f)} = \frac{S_{xy}(f)}{S_{xx}(f)} = H(f) \tag{9.70}$$


Thus, $H_1(f)$ is unbiased if the noise is present on the output only. We assume that appropriate
averaging and limiting operations are applied for this expression, i.e. theoretical spectral
density functions are used. Now, consider the $H_2(f)$ estimator which becomes


$$H_2(f) = \frac{S_{y_m y_m}(f)}{S_{y_m x}(f)} = \frac{S_{yy}(f) + S_{n_y n_y}(f)}{S_{yx}(f)} = H(f)\left(1 + \frac{S_{n_y n_y}(f)}{S_{yy}(f)}\right) \tag{9.71}$$


Note that this estimator is biased and overestimates $H(f)$ if the output noise is present,
depending on the signal-to-noise ratio of the output signal (it may be different for each frequency).
If the input is white noise, then the input power spectral density function $S_{xx}(f)$ is constant
over the entire frequency range while the output power spectral density $S_{yy}(f)$ varies as the
frequency changes, depending on the frequency response characteristics.

Now consider the case when only the input noise is present as shown in Figure 9.7. The
$H_1(f)$ and $H_2(f)$ estimators are


$$H_1(f) = \frac{S_{x_m y}(f)}{S_{x_m x_m}(f)} = \frac{S_{xy}(f)}{S_{xx}(f) + S_{n_x n_x}(f)} = \frac{H(f)}{1 + S_{n_x n_x}(f)/S_{xx}(f)} \tag{9.72}$$

$$H_2(f) = \frac{S_{yy}(f)}{S_{yx_m}(f)} = \frac{S_{yy}(f)}{S_{yx}(f) + S_{yn_x}(f)} = \frac{S_{yy}(f)}{S_{yx}(f)} = H(f) \tag{9.73}$$


Thus, it is shown that $H_2(f)$ is unbiased with respect to input noise while $H_1(f)$ is biased and
underestimates $H(f)$ if the input noise is present. Note that the bias of the $H_1(f)$ estimator
depends on the ratio $S_{n_x n_x}(f)/S_{xx}(f)$. If both noise and input signal are white noise, then the
ratio is constant for all frequencies, i.e. the $H_1(f)$ estimator becomes simply a scaled version
of the true frequency response function $H(f)$ (see MATLAB Example 9.4, Case (b)).


Example (M9.4)

Consider a system (with reference to Figure 9.8) that displays resonant and anti-resonant
behaviour, i.e. as shown in Figure 9.13.


Figure 9.13 A system with resonant and anti-resonant behaviour

Assume that both input and response are noise contaminated. The input and output
signal-to-noise ratios (SNRs) are $S_{xx}(f)/S_{n_x n_x}(f)$ and $S_{yy}(f)/S_{n_y n_y}(f)$. Also, assume
the noises are white.

Whilst the input SNR is unaffected by the system response, the output SNR is
largest at resonance ($f_r$) and smallest at anti-resonance ($f_{ar}$). Accordingly the ‘errors’
at the output are (relatively) more significant at $f_{ar}$ than $f_r$, so estimator $H_1(f)$ is more
appropriate than $H_2(f)$ for this frequency. Conversely, at frequency $f_r$ the output SNR
is high, and so errors on input may be more significant and therefore $H_2(f)$ may be more
appropriate.

Thus, $H_1(f)$ usually underestimates the frequency response function at resonances
of the structure but gives better estimates at anti-resonances than $H_2(f)$. On the other
hand, as mentioned earlier, $H_2(f)$ is relatively unbiased at resonances but significantly
overestimates near the anti-resonances (see MATLAB Example 9.4, Case (a)). Thus,
when both input and output noise are present the TLS estimator $H_T(f)$ (or $H_W(f)$ if
$\kappa(f)$ can be measured) may be preferred (see MATLAB Example 9.4, Case (c)).
Alternatively, a combination of frequency response function estimates $H_1(f)$, $H_2(f)$ and
$H_T(f)$ may also be used for different frequency regions appropriately.


Note that the biasing effect of noise on the estimators $H_1(f)$, $H_2(f)$ and $H_W(f)$ is limited
to the magnitude spectrum only, i.e. the phase spectrum is unaffected by uncorrelated noise
and is not biased. This can be easily verified from Equations (9.71) and (9.72), where $S_{n_y n_y}(f)$,
$S_{yy}(f)$, $S_{n_x n_x}(f)$ and $S_{xx}(f)$ are all real valued. Thus, it follows that


$$\arg S_{x_m y_m}(f) = \arg\left(\frac{1}{S_{y_m x_m}(f)}\right) = \arg H_1(f) = \arg H_2(f) = \arg H_W(f) = \arg H(f) \tag{9.74}$$

This result indicates that the phase spectrum is less sensitive to noise. However, note that this
is a theoretical result only. In practice we only have an estimate $\tilde{S}_{x_m y_m}(f)$. Thus, the phase
spectrum also has errors as we shall see in Chapter 10.
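
The bias behaviour described above can be illustrated with the theoretical forms (9.64), (9.66) and (9.69) evaluated at a single frequency. In the sketch below the true response `H_true`, the spectral levels and the function name `frf_estimators` are assumptions for illustration; the noises are taken to be uncorrelated with the signals and with each other, so that $S_{x_m y_m} = H S_{xx}$.

```python
import cmath
import math


def frf_estimators(H, S_xx, S_nx, S_ny):
    """Theoretical H1, H2 and HT at one frequency for given noise levels."""
    S_yy = abs(H) ** 2 * S_xx
    S_xmxm = S_xx + S_nx                 # measured input power
    S_ymym = S_yy + S_ny                 # measured output power
    S_xmym = H * S_xx                    # cross-spectrum is unaffected by the noise
    S_ymxm = S_xmym.conjugate()
    H1 = S_xmym / S_xmxm                                     # Equation (9.64)
    H2 = S_ymym / S_ymxm                                     # Equation (9.66)
    root = math.sqrt((S_xmxm - S_ymym) ** 2 + 4.0 * abs(S_xmym) ** 2)
    HT = (S_ymym - S_xmxm + root) / (2.0 * S_ymxm)           # Equation (9.69)
    return H1, H2, HT


H_true = 2.0 * cmath.exp(-1j * 0.7)      # hypothetical true response at this frequency
H1, H2, HT = frf_estimators(H_true, S_xx=1.0, S_nx=0.1, S_ny=0.1)

for name, value in (("H1", H1), ("H2", H2), ("HT", HT)):
    print(name, abs(value), cmath.phase(value))
```

With equal noise power on input and output (the $\kappa(f) = 1$ assumption), $H_T$ returns the true magnitude here, while $|H_1|$ falls below and $|H_2|$ rises above it; all three share the true phase.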


The Effect of Feedback 

In some situations, there may be feedback in a dynamical system, as shown for example in
Figure 9.14. The figure might depict a structure with frequency response function $H(f)$, with
$x(t)$ the force and $y(t)$ the response (e.g. acceleration). The excitation is assumed to come from
an electrodynamic shaker with input signal $r(t)$. The force applied depends on this excitation
but is also affected by the back emf (electromotive force) effect due to the motion. This is
modelled as the feedback path $G(f)$. A second input (uncorrelated with $r(t)$) to the system is
modelled by the signal $n(t)$. This could come from another (unwanted) excitation.


Figure 9.14 A system with feedback


The objective is to determine $H(f)$, i.e. the forward path frequency response function,
using the measured signals $x(t)$ (force) and $y(t)$ (acceleration). Simply using the $H_1(f)$ estimator
turns out not to be helpful as the following demonstrates.

In this case, $X(f)$ and $Y(f)$ can be written as


$$X(f) = R(f) + G(f)Y(f) = \frac{R(f) + G(f)N(f)}{1 - H(f)G(f)}$$

$$Y(f) = H(f)X(f) + N(f) = \frac{N(f) + H(f)R(f)}{1 - H(f)G(f)} \tag{9.75}$$


Thus, the $H_1(f)$ estimator based on the measured signals $x(t)$ and $y(t)$ gives


$$H_1(f) = \frac{S_{xy}(f)}{S_{xx}(f)} = \frac{H(f)S_{rr}(f) + G^*(f)S_{nn}(f)}{S_{rr}(f) + |G(f)|^2 S_{nn}(f)} \tag{9.76}$$


which is not the required $H(f)$. Rather than determining $H(f)$, note that as the noise power
gets large, $H_1(f)$ estimates the inverse of $G(f)$, i.e.


$$H_1(f) = \frac{S_{xy}(f)}{S_{xx}(f)} \approx \frac{1}{G(f)} \quad \text{as} \quad \frac{S_{rr}(f)}{S_{nn}(f)} \to 0 \tag{9.77}$$


It is clear, however, that in the absence of disturbance $n(t)$, $H_1(f)$ does indeed result in $H(f)$
even in the presence of feedback.

From this we see that (if the additional input n(t) is present) we need another approach,
and this was provided by Wellstead (1981), who proposed using a third signal, namely the
excitation to the shaker r(t). In essence he proposed an estimator referred to here as

$$
H_3(f) = \frac{S_{ry}(f)}{S_{rx}(f)}
\tag{9.78}
$$

i.e. the ratio of two cross-spectral density functions. Equation (9.75) can be rearranged as

$$
\begin{aligned}
X(f)\left[1 - G(f)H(f)\right] &= R(f) + G(f)N(f) \\
Y(f)\left[1 - G(f)H(f)\right] &= H(f)R(f) + N(f)
\end{aligned}
\tag{9.79}
$$

Then, since r(t) and n(t) are uncorrelated, the cross-spectral density functions are

$$
S_{rx}(f) = \frac{S_{rr}(f)}{1 - G(f)H(f)} \quad \text{and} \quad S_{ry}(f) = \frac{H(f)S_{rr}(f)}{1 - G(f)H(f)}
\tag{9.80}
$$

So $H_3(f) = H(f)$ even in the presence of disturbance n(t) and feedback.
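The behaviour of the two estimators under feedback can be checked numerically at a single frequency. The following Python sketch (Python is used here rather than the book's MATLAB, and all numerical values of H, G, S_rr and S_nn are illustrative assumptions) evaluates Equation (9.76) and the ratio of the cross-spectra in Equation (9.80) for increasing noise power.

```python
# Single-frequency check of Equations (9.76) and (9.78)-(9.80).
# The values of H, G, S_rr and S_nn are assumed for illustration only.

def h1_with_feedback(H, G, S_rr, S_nn):
    """H1 = S_xy / S_xx for the feedback system, Equation (9.76)."""
    return (H * S_rr + G.conjugate() * S_nn) / (S_rr + abs(G) ** 2 * S_nn)


def h3_with_feedback(H, G, S_rr):
    """H3 = S_ry / S_rx using the cross-spectra of Equation (9.80)."""
    loop = 1 - G * H              # common factor 1 - G(f)H(f)
    S_rx = S_rr / loop
    S_ry = H * S_rr / loop
    return S_ry / S_rx


H = complex(0.5, -1.2)   # assumed forward-path FRF value at one frequency
G = complex(0.3, 0.4)    # assumed feedback-path FRF value at the same frequency
S_rr = 1.0               # assumed power of the shaker excitation

for S_nn in (0.0, 1.0, 1e6):
    print(S_nn, h1_with_feedback(H, G, S_rr, S_nn), h3_with_feedback(H, G, S_rr))
```

With S_nn = 0 the first estimator returns H; as S_nn grows it moves towards 1/G, while the cross-spectral ratio stays at H throughout.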


9.4 BRIEF SUMMARY 

1. The input-output relationships in the time domain for a stationary random process x(t) are

$$
R_{yy}(\tau) = \int_0^\infty \int_0^\infty h(\tau_1)h(\tau_2)R_{xx}(\tau + \tau_1 - \tau_2)\,d\tau_1\,d\tau_2
\quad \text{and} \quad
R_{xy}(\tau) = \int_0^\infty h(\tau_1)R_{xx}(\tau - \tau_1)\,d\tau_1
$$

and the corresponding frequency domain expressions are

$$
S_{yy}(f) = |H(f)|^2 S_{xx}(f) \quad \text{and} \quad S_{xy}(f) = H(f)S_{xx}(f)
$$

2. If the input x(t) is white noise, then (for zero mean values)

$$
R_{yy}(\tau) = \sigma_x^2 R_{hh}(\tau) \quad \text{and} \quad R_{xy}(\tau) = \sigma_x^2 h(\tau)
$$

3. The ordinary coherence function between input x(t) and output y(t) is defined as

$$
\gamma_{xy}^2(f) = \frac{|S_{xy}(f)|^2}{S_{xx}(f)S_{yy}(f)}, \qquad 0 \le \gamma_{xy}^2(f) \le 1
$$

which measures the degree of linearity between x(t) and y(t).

4. When the effect of measurement noise on the output is considered, the coherent output
power is defined as

$$
S_{yy}(f) = \gamma_{xy_m}^2(f)\,S_{y_m y_m}(f)
$$

and the noise power (or uncoherent output power) is defined as

$$
S_{n_y n_y}(f) = \left[1 - \gamma_{xy_m}^2(f)\right] S_{y_m y_m}(f)
$$

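5. Power spectral and cross-spectral density functions can be estimated by

$$
\tilde{S}_{xx}(f) = \frac{1}{NT}\sum_{n=1}^{N} |X_n(f)|^2
\quad \text{and} \quad
\tilde{S}_{xy}(f) = \frac{1}{NT}\sum_{n=1}^{N} X_n^*(f)Y_n(f)
$$

where N is the number of segments, T is the length of each segment, and $X_n(f)$ and $Y_n(f)$ are the Fourier transforms of the nth segments of x(t) and y(t).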


6. The frequency response function is estimated by:

(a) $H_1(f) = \dfrac{S_{x_m y_m}(f)}{S_{x_m x_m}(f)}$

which is unbiased with respect to the output noise;

(b) $H_2(f) = \dfrac{S_{y_m y_m}(f)}{S_{y_m x_m}(f)}$

which is unbiased with respect to the input noise;

(c) $H_W(f) = \dfrac{S_{y_m y_m}(f) - \kappa(f)S_{x_m x_m}(f) + \sqrt{\left[S_{x_m x_m}(f)\kappa(f) - S_{y_m y_m}(f)\right]^2 + 4\,|S_{x_m y_m}(f)|^2\,\kappa(f)}}{2S_{y_m x_m}(f)}$

where $\kappa(f) = S_{n_y n_y}(f)/S_{n_x n_x}(f)$. This is unbiased with respect to both input and
output noise. If $\kappa(f)$ is unknown, $\kappa(f) = 1$ may be used.
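The bias properties listed in item 6 can be illustrated at a single frequency with idealized (noise-free estimates of) spectral densities. The Python sketch below assumes uncorrelated input and output measurement noise with powers S_nx and S_ny, so that the measured spectra are S_xx + S_nx, |H|^2 S_xx + S_ny and H S_xx. The function name `frf_estimators` and all numerical values are hypothetical choices made for this illustration.

```python
import math


def frf_estimators(S_xx_m, S_yy_m, S_xy_m, kappa=1.0):
    """Return (H1, H2, HW) from measured auto- and cross-spectral values."""
    S_yx_m = S_xy_m.conjugate()          # S_yx is the conjugate of S_xy
    H1 = S_xy_m / S_xx_m
    H2 = S_yy_m / S_yx_m
    root = math.sqrt((S_xx_m * kappa - S_yy_m) ** 2
                     + 4.0 * abs(S_xy_m) ** 2 * kappa)
    HW = (S_yy_m - kappa * S_xx_m + root) / (2.0 * S_yx_m)
    return H1, H2, HW


# Assumed true values at one frequency (illustrative only).
H = complex(2.0, -1.0)           # true FRF value
S_xx, S_nx, S_ny = 1.0, 0.2, 0.2  # input power and the two noise powers

# Idealized measured spectra for uncorrelated measurement noise.
S_xx_m = S_xx + S_nx
S_yy_m = abs(H) ** 2 * S_xx + S_ny
S_xy_m = H * S_xx

H1, H2, HW = frf_estimators(S_xx_m, S_yy_m, S_xy_m, kappa=S_ny / S_nx)
print(abs(H), abs(H1), abs(H2), abs(HW))
```

In this setting the first estimator underestimates |H|, the second overestimates it, and the third recovers H when the noise ratio is known; all three share the phase of H.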


9.5 MATLAB EXAMPLES 


Example 9.1: System identification using spectral density functions: a first-order
system

Consider the following first-order system (see Equation (9.15))

$$T\dot{y}(t) + y(t) = x(t)$$

where the impulse response function is $h(t) = (1/T)e^{-t/T}$ and the frequency response
function is $H(f) = 1/(1 + j2\pi fT)$. In this example, we use band-limited
white noise as the input x(t); then the output y(t) is obtained by convolution, i.e.
y(t) = h(t) * x(t). The spectral density functions $S_{xx}(f)$ and $S_{xy}(f)$ are estimated using
Welch's method (see Chapter 10 for details, and also see Comments 2 in MATLAB
Example 8.10).

Then, we shall estimate the frequency response function based on Equation (9.12),
$S_{xy}(f) = H(f)S_{xx}(f)$, i.e. $H_1(f) = S_{xy}(f)/S_{xx}(f)$. This estimate will be compared
with the DFT of h(t) (we mean here the DFT of the sampled, truncated impulse response
function).

In this MATLAB example, we do not consider measurement noise. So, we note that
$H_1(f) = H_2(f) = H_T(f)$.
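Before looking at the estimates, it may help to see how closely the Fourier sum of the sampled, truncated impulse response follows the analytic $H(f)$. The Python sketch below (standard library only; it is not the book's MATLAB code) uses the parameter values of this example, T = 0.1 s, fs = 500 Hz and a 1 second truncation, and evaluates the sum directly at a few frequencies rather than with an FFT. The function names are hypothetical.

```python
import cmath
import math

T = 0.1      # time constant in seconds, as in Example 9.1
fs = 500.0   # sampling rate in Hz, as in Example 9.1
T1 = 1.0     # truncation length of h(t) in seconds, as in Example 9.1


def H_analytic(f):
    """Analytic FRF of the first-order system, H(f) = 1/(1 + j*2*pi*f*T)."""
    return 1.0 / (1.0 + 1j * 2.0 * math.pi * f * T)


def H_sampled(f):
    """Fourier sum of the sampled, truncated h(t), scaled by 1/fs."""
    n_samples = int(T1 * fs) + 1
    total = 0j
    for n in range(n_samples):
        t = n / fs
        h = (1.0 / T) * math.exp(-t / T)
        total += h * cmath.exp(-1j * 2.0 * math.pi * f * t)
    return total / fs


for f in (0.0, 1.0, 5.0, 20.0):
    print(f, abs(H_analytic(f)), abs(H_sampled(f)))
```

The two magnitudes agree to within roughly one or two per cent over the band of interest; the small difference comes from sampling and truncating $h(t)$.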
Line  MATLAB code

1   clear all
2   fs=500; T1=1; T2=40; t1=0:1/fs:T1; t2=0:1/fs:T2-1/fs;
3   T=0.1;
4   h=1/T*exp(-t1/T);
5   randn('state',0);
6   x=randn(1,T2*fs);
7   fc=30; [b,a] = butter(9,fc/(fs/2));
8   x=filter(b,a,x);
9   x=x-mean(x); x=x/std(x);
10  y=conv(h,x); y=y(1:end-length(h)+1); % or y=filter(h,1,x);
11  y=y/fs;
12  N=4*fs; % N=10*fs;
13  Sxx=cpsd(x,x, hanning(N),N/2, N, fs, 'twosided');
14  Syy=cpsd(y,y, hanning(N),N/2, N, fs, 'twosided');
15  Sxy=cpsd(x,y, hanning(N),N/2, N, fs, 'twosided');
16  Sxx=fftshift(Sxx); Syy=fftshift(Syy); Sxy=fftshift(Sxy);
17  f=fs*(-N/2:N/2-1)/N;
18  H1=Sxy./Sxx;
19  H=fftshift(fft(h,N))/fs;
20  Gamma=abs(Sxy).^2./(Sxx.*Syy);
21  figure(1)
22  plot(f, 10*log10(Sxx))
23  xlabel('Frequency (Hz)'); ylabel('\itS_x_x(\itf\rm) (dB)')
24  axis([-30 30 -35 -5])
25  figure(2)
26  plot(f, 10*log10(abs(Sxy)))
27  xlabel('Frequency (Hz)'); ylabel('|\itS_x_y(\itf\rm)| (dB)')
28  axis([-30 30 -35 -15])
29  figure(3)
30  plot(f, 20*log10(abs(H1))); hold on
31  xlabel('Frequency (Hz)'); ylabel('|\itH\rm_1(\itf\rm)| (dB)')
32  plot(f, 20*log10(abs(H)), 'r:'); hold off
33  axis([-30 30 -30 5])
34  figure(4)
35  plot(f, unwrap(angle(H1))); hold on
36  xlabel('Frequency (Hz)'); ylabel('arg\itH\rm_1(\itf\rm) (rad)')
37  plot(f, unwrap(angle(H)), 'r:'); hold off
38  axis([-30 30 -1.6 1.6])
39  figure(5)
40  plot(f, Gamma)
41  xlabel('Frequency (Hz)'); ylabel('Coherence function')
42  axis([-150 150 0 1.1])

Comments on the code

Lines 1-4: Define the sampling rate and the time variables t1 (for the impulse response function h(t)) and t2 (for the band-limited white noise input x(t)). Then, generate the impulse response sequence accordingly, which is truncated at 1 second.

Lines 5-9: Generate the band-limited white noise input signal x(t). The cut-off frequency is set to 30 Hz for this example. The input sequence has a zero mean value, and the variance is one.

Lines 10-11: Then, obtain the output sequence by a convolution operation. Note that the output sequence y is scaled by the sampling rate in order to match its corresponding continuous function.

Lines 12-15: Calculate the spectral density functions using Welch's method; we use a Hann window and 50 % overlap. The length of each segment is defined by N, and is 4 seconds long in this case.

Lines 16-17: Note that we defined both negative and positive frequencies (Line 17), thus the MATLAB function 'fftshift' is used to shift the zero-frequency component to the centre of the spectrum.

Lines 18-20: Calculate $H_1(f) = S_{xy}(f)/S_{xx}(f)$, and also calculate H(f) using the DFT of the impulse response sequence. Also, compute the coherence function.

Lines 21-28: Plot the 'calculated (estimated)' power spectral density function and the magnitude spectrum of the cross-spectral density function, for the frequency range -30 to 30 Hz. Note that these functions are only estimates of the true spectral density functions, i.e. they are $\tilde{S}_{xx}(f)$ and $\tilde{S}_{xy}(f)$. So, we may see some variability as shown in the figures. Note that we use '10*log10(Sxx)' and '10*log10(abs(Sxy))' for the dB scale, since the quantities are already power-like.

Lines 29-33: Plot the magnitude spectrum of both $H_1(f)$ and H(f) for the frequency range -30 to 30 Hz.

Lines 34-38: Plot the phase spectrum of both $H_1(f)$ and H(f) for the frequency range -30 to 30 Hz.

Lines 39-42: Plot the coherence function.

Results

[Figures (a)-(e) are not reproduced here. Recoverable panel captions: (b) Estimate of $|S_{xy}(f)|$; (e) Coherence function $\gamma_{xy}^2(f)$.]
Comments:

1. Note that the coherence function $\gamma_{xy}^2(f) \approx 1$ within the frequency band of interest,
except at the peak (i.e. at zero frequency). The drop of the coherence function at f = 0 is
due to the bias error. This bias error can be reduced by improving the resolution (see
Chapter 10 for details). To improve the resolution, the length of the segment must be
increased (but note that this reduces the number of averages). For example, replace the
number 4 with 10 in Line 12 of the MATLAB code. This increases the window length
in the time domain, thus increasing the frequency resolution. The result is shown in
Figure (f), where the coherence function is almost unity including the value at f = 0.

(f) Coherence function $\gamma_{xy}^2(f)$ using segments 10 seconds long

2. From Figures (c) and (d), we see that we have an almost perfect estimate for the
frequency response function from $H_1(f) = S_{xy}(f)/S_{xx}(f)$. However, the individual
spectral density function estimates show a large variability as shown in Figures (a)
and (b). Note that, in theory, $S_{xx}(f)$ is constant and $S_{xy}(f)$ is a scaled version of H(f)
if the input is white noise.

It is emphasized that the errors are not due to noise (we did not consider
measurement noise in this example). In fact, these are the statistical errors inherent in
the estimation processes (see Chapter 10). By comparing Figures (a)-(c), we may see
that the estimate of H(f) is less sensitive to the statistical errors than the estimates of
the spectral density functions. This will be discussed in Chapter 10.

Note that, even if there is no noise, the estimate of H(f) may have large statistical
errors if the number of averages (for the segment averaging method) is small. To
demonstrate this, change the length of time (T2) in Line 2 of the MATLAB code, i.e.
let T2 = 6. The result is shown in Figure (g), where we see relatively large random
errors near the peak.

(g) Magnitude spectrum of $H_1(f)$, for T2 = 6
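The trade-off between resolution and number of averages mentioned in these comments can be put into numbers. The Python sketch below is an illustration only: the helper name `welch_tradeoff` is hypothetical, the resolution is taken as the nominal bin spacing 1/(segment length), which ignores the widening caused by the Hann window, and the segment count assumes that only complete segments are used.

```python
def welch_tradeoff(record_s, segment_s, overlap=0.5):
    """Nominal resolution (Hz) and number of segments for segment averaging.

    record_s  : total record length in seconds
    segment_s : segment (window) length in seconds
    overlap   : fractional overlap between adjacent segments
    """
    resolution_hz = 1.0 / segment_s
    hop_s = segment_s * (1.0 - overlap)             # shift between segments
    n_segments = int((record_s - segment_s) // hop_s) + 1
    return resolution_hz, n_segments


print(welch_tradeoff(40.0, 4.0))    # N = 4*fs with T2 = 40
print(welch_tradeoff(40.0, 10.0))   # N = 10*fs with T2 = 40
print(welch_tradeoff(6.0, 4.0))     # N = 4*fs with T2 = 6
```

For the 40 second record, moving from 4 second to 10 second segments improves the nominal resolution from 0.25 Hz to 0.1 Hz but reduces the number of averaged segments from 19 to 7; with T2 = 6 only 2 segments remain, which is consistent with the larger random errors noted above.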
Example 9.2: A first-order continuous system with a sinusoidal input

Consider the same first-order system as in MATLAB Example 9.1. Now, the input is a
sine function, i.e.

$$x(t) = A\sin(2\pi f_0 t) \quad \text{and} \quad R_{xx}(\tau) = \frac{A^2}{2}\cos(2\pi f_0 \tau)$$

Then, the output y(t) can be written as

$$y(t) = \frac{A}{\sqrt{1 + (2\pi f_0 T)^2}}\sin(2\pi f_0 t + \theta), \qquad \theta = \tan^{-1}(-2\pi f_0 T)$$

In Section 9.1, we have seen that the autocorrelation and cross-correlation functions are
(see Equations (9.23) and (9.25))

$$R_{yy}(\tau) = \frac{A^2}{2}\,\frac{1}{1 + (2\pi f_0 T)^2}\cos(2\pi f_0 \tau) \quad \text{and} \quad R_{xy}(\tau) = \frac{A^2}{2}\,\frac{1}{\sqrt{1 + (2\pi f_0 T)^2}}\sin(2\pi f_0 \tau + \phi)$$

where

$$\phi = \tan^{-1}\left(\frac{1}{2\pi f_0 T}\right)$$

In this example, we shall verify this.


Line  MATLAB code

1   clear all
2   fs=500; T1=1; T2=40; t1=0:1/fs:T1; t2=0:1/fs:T2-1/fs;
3   T=0.1;
4   h=1/T*exp(-t1/T);
5   A=2; f=1; w=2*pi*f;
6   x=A*sin(w*t2);
7   y=filter(h,1,x)/fs;
8   maxlag=2*fs;
9   [Ryy, tau]=xcorr(y,y,maxlag, 'unbiased');
10  [Rxy, tau]=xcorr(y,x,maxlag, 'unbiased');
11  tau=tau/fs;
12  phi=atan(1/(w*T));
13  Ryy_a=(A^2/2)*(1./(1+(w*T).^2)).*cos(w*tau);
14  Rxy_a=(A^2/2)*(1./sqrt(1+(w*T).^2)).*sin(w*tau+phi);
15  figure(1)
16  plot(tau,Ryy,tau,Ryy_a, 'r:')
17  xlabel('Lag (\it\tau)'); ylabel('\itR_y_y\rm(\it\tau\rm)')
18  figure(2)
19  plot(tau,Rxy,tau,Rxy_a, 'r:')
20  xlabel('Lag (\it\tau)'); ylabel('\itR_x_y\rm(\it\tau\rm)')

Comments on the code

Lines 1-7: Same as in MATLAB Example 9.1, except that the input is now a 1 Hz sine function.

Lines 8-11: Define the maximum lag and calculate the correlation functions.

Lines 12-14: Calculate the true $R_{yy}(\tau)$ and $R_{xy}(\tau)$ using Equations (9.23) and (9.25).

Lines 15-17: Plot both estimated and true autocorrelation functions (Ryy and Ryy_a, respectively).

Lines 18-20: Plot both estimated and true cross-correlation functions (Rxy and Rxy_a, respectively).
Results

[Figures are not reproduced here. Horizontal axes: Lag ($\tau$).]

(a) Autocorrelation function $R_{yy}(\tau)$ (solid line: estimated; dashed line: true function)

(b) Cross-correlation function $R_{xy}(\tau)$ (solid line: estimated; dashed line: true function)


Comments: As mentioned in Section 9.1, this example has shown that the response to a 
sinusoidal input is the same sinusoid with scaled amplitude and shifted phase. 
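The closed-form correlation functions of this example can also be checked without any signal processing toolbox by time-averaging the steady-state sinusoids over a whole number of periods. The Python sketch below uses the parameter values of the example (A = 2, f0 = 1 Hz, T = 0.1 s); the function names and the averaging grid are hypothetical choices, and the transient of the filter is deliberately ignored.

```python
import math

A, f0, T = 2.0, 1.0, 0.1            # values used in Example 9.2
w = 2.0 * math.pi * f0
gain = 1.0 / math.sqrt(1.0 + (w * T) ** 2)
theta = math.atan(-w * T)           # phase shift of the response
phi = math.atan(1.0 / (w * T))      # phase used in the R_xy formula


def x(t):
    return A * math.sin(w * t)


def y(t):
    # Steady-state response of the first-order system (transient ignored).
    return A * gain * math.sin(w * t + theta)


def time_average(func, n_periods=10, steps_per_period=1000):
    """Average func(t) over a whole number of periods of the sinusoid."""
    n = n_periods * steps_per_period
    dt = 1.0 / (f0 * steps_per_period)
    return sum(func(k * dt) for k in range(n)) / n


def R_xy_numeric(tau):
    return time_average(lambda t: x(t) * y(t + tau))


def R_yy_numeric(tau):
    return time_average(lambda t: y(t) * y(t + tau))


def R_xy_formula(tau):
    return (A ** 2 / 2.0) * gain * math.sin(w * tau + phi)


def R_yy_formula(tau):
    return (A ** 2 / 2.0) * gain ** 2 * math.cos(w * tau)


for tau in (0.0, 0.1, 0.37):
    print(tau, R_xy_numeric(tau), R_xy_formula(tau),
          R_yy_numeric(tau), R_yy_formula(tau))
```

The numerical averages and the formulas agree to rounding error, which reflects the identity $\phi = \pi/2 + \theta$ linking the two phase angles.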


Example 9.3: Transmission path identification 


We consider the simple acoustic problem shown in Figure 9.15. Then, we may model
the measured signals as

$$
\begin{aligned}
\text{Mic. A:}\quad x(t) &= as(t) \\
\text{Mic. B:}\quad y(t) &= bs(t - \Delta_1) + cs(t - \Delta_2)
\end{aligned}
\tag{9.81}
$$

where $\Delta_1$ and $\Delta_2$ are time delays.

[Diagram not reproduced; recoverable label: Hard reflector.]

Figure 9.15 A simple acoustic example: transmission path identification

If the source signal s(t) is broadband, then the autocorrelation function $R_{ss}(\tau)$ is
narrow as depicted in the figure. By treating x(t) as an input and y(t) as an output,
i.e. y(t) = h(t) * x(t), as we have seen in Chapter 4, the impulse response function and
frequency response function are given by

$$h(t) = \frac{b}{a}\delta(t - \Delta_1) + \frac{c}{a}\delta(t - \Delta_2) \tag{9.82}$$

$$H(f) = \frac{b}{a}e^{-j2\pi f\Delta_1}\left[1 + \frac{c}{b}e^{-j2\pi f(\Delta_2 - \Delta_1)}\right] \tag{9.83}$$
We now establish the delays and the relative importance of paths by forming the
cross-correlation between x(t) and y(t) as

$$R_{xy}(\tau) = E[x(t)y(t + \tau)] = abR_{ss}(\tau - \Delta_1) + acR_{ss}(\tau - \Delta_2) \tag{9.84}$$

This may be drawn as in Figure 9.16.




Figure 9.16 Cross-correlation function between x(t) and y(t) 


Note that $\Delta_1$ and $\Delta_2$ are identified if $R_{ss}(\tau)$ is 'narrow' compared with $\Delta_2 - \Delta_1$,
and the relative magnitudes yield b/c. If the source signal has a bandwidth of B as shown
in Figure 9.17, then the autocorrelation function of s(t) can be written as

$$R_{ss}(\tau) = AB\,\frac{\sin(\pi B\tau)}{\pi B\tau}\cos(2\pi f_0\tau) \tag{9.85}$$

Thus, in order to resolve the delays, it is required that (roughly) $\Delta_2 - \Delta_1 > 2/B$.
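The rough resolution requirement can be tried out with a few numbers. The Python sketch below evaluates Equation (9.85) and the 2/B criterion; the function names are hypothetical, and the bandwidth B = 20 Hz with $f_0 = 0$ (a low-pass source) and the amplitude A = 1 are assumed values chosen to match the MATLAB example that follows later in this section.

```python
import math


def R_ss(tau, A, B, f0):
    """Autocorrelation of the band-limited source, Equation (9.85)."""
    arg = math.pi * B * tau
    envelope = 1.0 if arg == 0.0 else math.sin(arg) / arg
    return A * B * envelope * math.cos(2.0 * math.pi * f0 * tau)


def min_separation(B):
    """Rough resolution limit quoted in the text: 2/B seconds."""
    return 2.0 / B


def delays_resolvable(delta1, delta2, B):
    """True if the two delays satisfy the rough criterion delta2 - delta1 > 2/B."""
    return abs(delta2 - delta1) > min_separation(B)


B = 20.0   # assumed bandwidth in Hz (low-pass case, f0 = 0)
print(min_separation(B))
print(delays_resolvable(1.0, 1.5, B))
print(delays_resolvable(1.0, 1.07, B))
print(R_ss(0.0, 1.0, B, 0.0), R_ss(1.0 / B, 1.0, B, 0.0))
```

With these assumed values the criterion asks for a separation of more than 0.1 s, so delays of 1 s and 1.5 s pass while 1 s and 1.07 s do not. The last line shows the peak value AB at zero lag and the first zero of the envelope at a lag of 1/B.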


[Figure: $S_{ss}(f)$ plotted against f for a band-limited signal of bandwidth B; recoverable labels: A, B.]

Figure 9.17 Power spectral density function of the band-limited signal s(t)


Time domain methods as outlined above are probably best - but we might also look
at frequency domain methods. First, consider the cross-spectral density function for the
simpler case of no reflector, i.e. as shown in Figure 9.18.

[Diagram: Mic. A measures $x(t) = as(t)$; after a delay $\Delta_1$, Mic. B measures $y(t) = bs(t - \Delta_1)$.]

Figure 9.18 A simple acoustic example with no reflector

Then, $R_{xy}(\tau) = abR_{ss}(\tau - \Delta_1)$ and $S_{xy}(f) = ab\,e^{-j2\pi f\Delta_1}S_{ss}(f)$. So arg $S_{xy}(f)$
gives the delay (see also MATLAB Example 8.9), but it turns out that the phase is more
sensitive to other reflections (not the uncorrelated noise) than the correlation function.

Now reconsider the first problem (with a hard reflector). Suppose that y(t) is noise
contaminated, i.e. $y_m(t) = y(t) + n(t)$. If n(t) is independent of y(t) then $R_{xy_m}(\tau) = R_{xy}(\tau)$
and $S_{xy_m}(f) = S_{xy}(f)$. Thus, from Equation (9.84), the cross-spectral density
function is

$$S_{xy_m}(f) = S_{xy}(f) = ab\,e^{-j2\pi f\Delta_1}\left[1 + \frac{c}{b}e^{-j2\pi f(\Delta_2 - \Delta_1)}\right]S_{ss}(f) \tag{9.86}$$

So the delay information is contained in the phase. However, unlike the single delay
problem, arg $S_{xy}(f)$ shows a mixture of two delay components. As will be seen later,
although it is possible to identify both delays $\Delta_1$ and $\Delta_2$ from the cross-spectral density
function, the frequency domain method is more difficult in this case. Also, consider the
coherence function (see Equation (9.33)), which is

$$\gamma_{xy_m}^2(f) = \frac{|S_{xy_m}(f)|^2}{S_{xx}(f)S_{y_m y_m}(f)} = \frac{S_{yy}(f)}{S_{yy}(f) + S_{nn}(f)} = \frac{S_{yy}(f)}{S_{y_m y_m}(f)} \tag{9.87}$$

For convenience, let b = c and $\Delta_2 - \Delta_1 = \Delta$; then

$$S_{yy}(f) = 2b^2\left[1 + \cos(2\pi f\Delta)\right]S_{ss}(f) \tag{9.88}$$

So we see that $\gamma_{xy_m}^2(f) = 0$ at certain frequencies ($f = n/2\Delta$, n = 1, 3, 5, ...), i.e.
the coherence collapses owing to destructive interference (i.e. the measurement SNR
becomes very low).
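The coherence collapse can be reproduced directly from Equations (9.87) and (9.88). The Python sketch below assumes b = c, a flat source spectrum $S_{ss}(f) = 1$ and a flat noise spectrum $S_{nn}(f) = 0.01$; these values, and the function name `coherence`, are assumptions made for illustration only.

```python
import math


def coherence(f, b, delta, S_ss, S_nn):
    """Coherence from Equations (9.87) and (9.88) for the case b = c."""
    S_yy = 2.0 * b ** 2 * (1.0 + math.cos(2.0 * math.pi * f * delta)) * S_ss
    return S_yy / (S_yy + S_nn)


b = 0.8        # assumed path gain (b = c)
delta = 0.5    # assumed relative delay in seconds
S_ss = 1.0     # assumed flat source spectrum
S_nn = 0.01    # assumed flat noise spectrum

# Frequencies f = n/(2*delta), n = 1, 3, 5, where the two paths cancel.
null_frequencies = [n / (2.0 * delta) for n in (1, 3, 5)]

for f in null_frequencies:
    print(f, coherence(f, b, delta, S_ss, S_nn))

# Midway between the nulls the two paths add constructively.
print(2.0, coherence(2.0, b, delta, S_ss, S_nn))
```

At 1, 3 and 5 Hz the output power from the source vanishes and the coherence falls to (essentially) zero, while at 2 Hz it stays close to one.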

In the above, we considered both individual transmission paths as non-dispersive.
(Note that the two paths taken together are dispersive, i.e. the group delay is $-d\phi/d\omega \ne$
const.) In practical cases, we must first decide whether the paths are dispersive or
non-dispersive. If dispersive, the propagation velocity varies with frequency. In such cases,
broadband methods may not be successful since waves travel at different speeds. In
order to suppress the dispersive effect, the cross-correlation method is applied for narrow
frequency bands, though this too has a smearing effect.


We now examine the transmission path identification problem described above, where
the measured signals are

$$
\begin{aligned}
x(t) &= as(t) \\
y_m(t) &= y(t) + n(t) = bs(t - \Delta_1) + cs(t - \Delta_2) + n(t)
\end{aligned}
$$

The cross-correlation function and the cross-spectral density function are

$$R_{xy_m}(\tau) = E[x(t)y_m(t + \tau)] = abR_{ss}(\tau - \Delta_1) + acR_{ss}(\tau - \Delta_2)$$

and

$$S_{xy_m}(f) = \left[ab\,e^{-j2\pi f\Delta_1} + ac\,e^{-j2\pi f\Delta_2}\right]S_{ss}(f)$$

In this example, we shall compare the time domain method (using the cross-correlation
function) and the frequency domain method (using the cross-spectral density function).


Line  MATLAB code

1   clear all
2   fs=100; T=502; t=0:1/fs:T-1/fs;
3   randn('state',0);
4   s=randn(size(t));
5   fc=10; [b,a] = butter(9,fc/(fs/2));
6   s=filtfilt(b,a,s);
7   s=s-mean(s); s=s/std(s); % Makes mean(s)=0 & std(s)=1;
8   a=1; b=0.8; c=0.75; delta1=1; delta2=1.5; % delta2=1.07;
9   N1=2*fs; N2=T*fs-N1;
10  x=a*s(N1+1:N1+N2);
11  y1=b*s(N1-(delta1*fs)+1:N1-(delta1*fs)+N2);
12  y2=c*s(N1-(delta2*fs)+1:N1-(delta2*fs)+N2);
13  y=y1+y2;
14  randn('state',10);
15  n=randn(size(y))*0.1;
16  y=y+n;
17  maxlag=2*fs;
18  [Rxy, tau]=xcorr(y,x,maxlag, 'unbiased');
19  tau=tau/fs;
20  T1=50;
21  [Gxx, f]=cpsd(x,x, hanning(T1*fs),T1*fs/2, T1*fs, fs);
22  [Gyy, f]=cpsd(y,y, hanning(T1*fs),T1*fs/2, T1*fs, fs);
23  [Gxy, f]=cpsd(x,y, hanning(T1*fs),T1*fs/2, T1*fs, fs);
24  Gamma=abs(Gxy).^2./(Gxx.*Gyy);
25  figure(1)
26  plot(tau(maxlag+1:end),Rxy(maxlag+1:end))
27  xlabel('Lag (\it\tau)')
28  ylabel('Cross-correlation')
29  axis([0 2 -0.2 0.8])
30  figure(2)
31  plot(f,unwrap(angle(Gxy)))
32  xlabel('Frequency (Hz)')
33  ylabel('arg\itG_x_y\rm(\itf\rm) (rad)')
34  axis([0 15 -90 0])
35  figure(3)
36  plot(f,Gamma)
37  xlabel('Frequency (Hz)'); ylabel('Coherence function')
38  axis([0 15 0 1])

Comments on the code

Lines 1-7: Define the sampling rate and time variable. Generate a band-limited white noise signal, where the (full) bandwidth (equivalent to B in Figure 9.17) is approximately 20 Hz ($-f_c$ to $f_c$).

Line 8: Define parameters for the signals x(t) and y(t). Also, define the time delays, $\Delta_1 = 1$ and $\Delta_2 = 1.5$. Note that $\Delta_2 - \Delta_1 = 0.5$. Later, use $\Delta_2 = 1.07$, and compare the cross-correlation functions.

Lines 9-16: Generate the signals x(t) and y(t). Also, add some noise to the signal y(t).

Lines 17-19: Calculate the cross-correlation function.

Lines 20-24: Calculate the (one-sided) spectral density functions and the coherence function.

Lines 25-29: Plot the cross-correlation function. As shown in Figure (a), $\Delta_1 = 1$ and $\Delta_2 = 1.5$ are clearly identified. However, if $\Delta_2 = 1.07$ is used, it is not possible to detect the delays, as shown in Figure (d). Note that the bandwidth of the signal s(t) is approximately 20 Hz, thus it is required that $\Delta_2 - \Delta_1 > 0.1$ for this method to be applicable.

Lines 30-34: Plot the phase spectrum of the cross-spectral density function $G_{xy}(f)$. As shown in Figure (b), the phase curve is no longer a straight line, but it has a 'periodic' structure. In fact, the relative delay $\Delta_2 - \Delta_1$ can be found by observing this periodicity as described in the figure, while $\Delta_1$ can be obtained from the overall slope of the phase curve. Compare this phase spectrum with that of a single delay problem (see MATLAB Example 8.9).

Lines 35-38: Plot the coherence function. Note that the coherence drops owing to the interference between the two delay components (see Equations (9.87) and (9.88)).
Results

[Figures are not reproduced here.]

(a) Cross-correlation $R_{xy}(\tau)$, $\Delta_1 = 1$ and $\Delta_2 = 1.5$

(b) Phase spectrum of $G_{xy}(f)$

(c) Coherence function

(d) Cross-correlation $R_{xy}(\tau)$, $\Delta_1 = 1$ and $\Delta_2 = 1.07$


Comments: Note that the time domain method is much simpler and clearer. However,
the signal must be wideband if the relative delay $\Delta_2 - \Delta_1$ is small, otherwise the time
domain method may fail as shown in Figure (d).


Example 9.4: Frequency response function estimators $H_1(f)$, $H_2(f)$ and $H_T(f)$

Consider the following impulse response function of a two-degree-of-freedom system:

$$h(t) = \frac{A_1}{\omega_{d1}}e^{-\zeta_1\omega_{n1}t}\sin\omega_{d1}t + \frac{A_2}{\omega_{d2}}e^{-\zeta_2\omega_{n2}t}\sin\omega_{d2}t$$

In this example, we use white noise as the input x(t), and the output y(t) is obtained
by y(t) = h(t) * x(t). We also consider uncorrelated measurement noise.

Three FRF estimators, $H_1(f)$, $H_2(f)$ and $H_T(f)$, are compared for three different
cases: Case (a), output noise only; Case (b), input noise only; and Case (c), both input
and output noise.

Equations (9.63), (9.65) and (9.68) are used in this example, i.e.

$$H_1(f) = \frac{S_{x_m y_m}(f)}{S_{x_m x_m}(f)}, \qquad H_2(f) = \frac{S_{y_m y_m}(f)}{S_{y_m x_m}(f)}$$

$$H_T(f) = \frac{S_{y_m y_m}(f) - S_{x_m x_m}(f) + \sqrt{\left[S_{x_m x_m}(f) - S_{y_m y_m}(f)\right]^2 + 4\,|S_{x_m y_m}(f)|^2}}{2S_{y_m x_m}(f)}$$

where the spectral density functions are estimated using the segment averaging method.


Line  MATLAB code

1   clear all
2   A1=20; A2=30; f1=5; f2=15; wn1=2*pi*f1; wn2=2*pi*f2;
3   zeta1=0.05; zeta2=0.03;
4   wd1=sqrt(1-zeta1^2)*wn1; wd2=sqrt(1-zeta2^2)*wn2;
5   fs=50; T1=10; t1=[0:1/fs:T1-1/fs];
6   h=(A1/wd1)*exp(-zeta1*wn1*t1).*sin(wd1*t1) + (A2/wd2)*exp(-zeta2*wn2*t1).*sin(wd2*t1);
7   T=50000;
8   randn('state',0);
9   x=randn(1,T*fs);
10  y=filter(h,1,x); % we do not scale for convenience
11  randn('state',10);
12  nx=0.5*randn(size(x)); % nx=0 for Case (a)
13  randn('state',20);
14  ny=0.5*randn(size(y)); % ny=0 for Case (b)
15  x=x+nx; y=y+ny;
16  clear nx ny
17  [Gxx, f]=cpsd(x(1:T*fs),x(1:T*fs), hanning(T1*fs),T1*fs/2, T1*fs, fs);
18  [Gyy, f]=cpsd(y(1:T*fs),y(1:T*fs), hanning(T1*fs),T1*fs/2, T1*fs, fs);
19  [Gxy, f]=cpsd(x(1:T*fs),y(1:T*fs), hanning(T1*fs),T1*fs/2, T1*fs, fs);
20  [Gyx, f]=cpsd(y(1:T*fs),x(1:T*fs), hanning(T1*fs),T1*fs/2, T1*fs, fs);
21  H1=Gxy./Gxx;
22  H2=Gyy./Gyx;
23  HT=(Gyy-Gxx + sqrt((Gxx-Gyy).^2 + 4*abs(Gxy).^2))./(2*Gyx);
24  H=fft(h);
25  figure(1)
26  plot(f,20*log10(abs(H1)), f,20*log10(abs(H(1:length(f)))), 'r:')
27  xlabel('Frequency (Hz)'); ylabel('|\itH\rm_1(\itf\rm)| (dB)')
28  axis([0 25 -35 25])
29  figure(2)
30  plot(f,20*log10(abs(H2)), f,20*log10(abs(H(1:length(f)))), 'r:')
31  xlabel('Frequency (Hz)'); ylabel('|\itH\rm_2(\itf\rm)| (dB)')
32  axis([0 25 -35 25])
33  figure(3)
34  plot(f,20*log10(abs(HT)), f,20*log10(abs(H(1:length(f)))), 'r:')
35  xlabel('Frequency (Hz)'); ylabel('|\itH_T(\itf\rm)| (dB)')
36  axis([0 25 -35 25])

Comments on the code

Lines 1-6: Define parameters for the impulse response function h(t), and generate the sequence accordingly. The sampling rate is chosen as 50 Hz, and the length of the impulse response function is 10 seconds.

Lines 7-10: Define the length of the input signal, and generate the input white noise sequence 'x'. Then obtain the output sequence 'y'. Note that we define very long sequences to minimize random errors on the estimation of the spectral density functions. This will be discussed in Chapter 10.

Lines 11-16: Generate the uncorrelated input measurement noise and output measurement noise. Note that we define the noise such that the variances of the input noise and the output noise are the same, i.e. $\kappa(f) = 1$. Add these noises to the input and output appropriately. Then clear the variables 'nx' and 'ny' (to save computer memory). This script is for Case (c). Replace Line 12 with 'nx=0' for Case (a), and replace Line 14 with 'ny=0' for Case (b).

Lines 17-20: Calculate the (one-sided) spectral density functions using the segment averaging method.

Lines 21-24: Then calculate the frequency response function estimates $H_1(f)$, $H_2(f)$ and $H_T(f)$. Note that $H_T(f) = H_W(f)$ since $\kappa(f) = 1$. Also calculate H(f) by the DFT of the impulse response sequence. Then compare the results.

Lines 25-28: Plot the magnitude spectrum of both $H_1(f)$ and H(f).

Lines 29-32: Plot the magnitude spectrum of both $H_2(f)$ and H(f).

Lines 33-36: Plot the magnitude spectrum of both $H_T(f)$ and H(f).
Results: Case (a) output noise only (Replace Line 12 with 'nx=0'.)

Results: Case (b) input noise only (Replace Line 14 with 'ny=0'.)

Results: Case (c) both input and output noise
Comments:

1. We have demonstrated that $H_1(f)$ is unbiased with respect to the output noise (see
Case (a)), $H_2(f)$ is unbiased with respect to the input noise (see Case (b)), and $H_T(f)$
is unbiased with respect to both input and output noise if $\kappa(f) = 1$ (see Case (c)).
Note that, for all different cases, the $H_T(f)$ estimator gives more consistent estimates
of the frequency response function. Thus, the TLS estimator $H_T(f)$ (or $H_W(f)$ if
$\kappa(f)$ is measurable) is highly recommended. However, in practical applications, it is
always wise to compare all three estimators and choose the 'best' estimator based on
some prior knowledge.

As described in Equation (9.74) in Section 9.3, note that the phase spectrum of all
three estimators is the same. To see this, type the following script in the MATLAB
command window. The results are shown in Figure (d).

    figure(4)
    plot(f,unwrap(angle(H1)), f,unwrap(angle(H2)), f,unwrap(angle(HT)), f,unwrap(angle(H(1:length(f)))), 'k:')
    xlabel('Frequency (Hz)'); ylabel('Phase spectrum (rad)')

(d) Phase spectra of $H_1(f)$, $H_2(f)$ and $H_T(f)$

2. We note that the inverse DFT of the FRF estimate gives the corresponding estimated
impulse response sequence. As mentioned in Chapter 6, this impulse response sequence
can be regarded as an MA system (i.e. an FIR filter). In this MATLAB
example, it has 500 MA coefficients. In real-time signal processing (such as active
control), it may be useful if the number of coefficients can be reduced, especially for
the case of a large number of filter coefficients. One approach to this is by curve fitting
the estimated FRF data to a reduced order ARMA model (see Equation (6.12)).
The basic procedure of the curve fitting algorithm can be found in Levi (1959). In
MATLAB, the function called 'invfreqz' finds the coefficients for the ARMA model
based on the estimated frequency response function. Type the following script to
find the reduced order ARMA model (we use the ARMA(4,4) model, which has 10
coefficients in total):

    [b,a]=invfreqz(HT, 2*pi*f/fs, 4,4, [], 30);
    Hz=freqz(b,a,length(f),fs);
    figure(5)
    plot(f,20*log10(abs(Hz)),f,20*log10(abs(H(1:length(f)))), 'r:')
    xlabel('Frequency (Hz)'); ylabel('Magnitude spectrum (dB)')
    axis([0 25 -35 25])
    figure(6)
    plot(f,unwrap(angle(Hz)), f,unwrap(angle(H(1:length(f)))), 'r:')
    xlabel('Frequency (Hz)'); ylabel('Phase spectrum (rad)')

The first line of the MATLAB script finds the coefficients for the ARMA(4,4) model
based on the estimated FRF (we use the results of $H_T(f)$ in this example), and the second
line evaluates the frequency response $H_z(f)$ based on the coefficient vectors 'a' and 'b'
obtained from the first line. Then, plot both magnitude and phase spectra of $H_z(f)$ and
compare with those of H(f). The results are as in Figures (e) and (f).




Example 9.5: A practical example of system identification 

We consider the same experimental setup used in MATLAB Example 6.7 (impact testing 
of a structure), except that we use a band-limited white noise for the input signal as shown 
in Figure (a). In this experiment, the frequency band of the signal is set at 5 to 90 Hz and 
the sampling rate is chosen as f s = 256 Hz. 



(a) Experimental set-up 

In this example, we will compare the results of three different FRF estimators, $H_1(f)$,
$H_2(f)$ and $H_T(f)$, using the measured data stored in the file 'beam_experiment.mat'.¹

¹ The data files can be downloaded from the Companion Website (www.wiley.com/go/shin_hammond).
Line  MATLAB code

1   clear all
2   load beam_experiment
3   fs=256; T=4;
4   [Gxx, f]=cpsd(x,x, hanning(T*fs),T*fs/2, T*fs, fs);
5   [Gyy, f]=cpsd(y,y, hanning(T*fs),T*fs/2, T*fs, fs);
6   [Gxy, f]=cpsd(x,y, hanning(T*fs),T*fs/2, T*fs, fs);
7   [Gyx, f]=cpsd(y,x, hanning(T*fs),T*fs/2, T*fs, fs);
8   H1=Gxy./Gxx;
9   H2=Gyy./Gyx;
10  HT=(Gyy - Gxx + sqrt((Gxx-Gyy).^2 + 4*abs(Gxy).^2))./(2*Gyx);
11  figure(1)
12  plot(f,20*log10(abs(H1)))
13  xlabel('Frequency (Hz)'); ylabel('|\itH\rm_1(\itf\rm)| (dB)')
14  axis([5 90 -45 25])
15  figure(2)
16  plot(f,20*log10(abs(H2)))
17  xlabel('Frequency (Hz)'); ylabel('|\itH\rm_2(\itf\rm)| (dB)')
18  axis([5 90 -45 25])
19  figure(3)
20  plot(f,20*log10(abs(HT)))
21  xlabel('Frequency (Hz)'); ylabel('|\itH_T(\itf\rm)| (dB)')
22  axis([5 90 -45 25])
23  figure(4)
24  plot(f(21:361), unwrap(angle(H1(21:361))), f(21:361), unwrap(angle(H2(21:361))), f(21:361), unwrap(angle(HT(21:361))));
25  xlabel('Frequency (Hz)'); ylabel('Phase spectrum (rad)')
26  axis([5 90 -7 0.5])

Comments on the code

Lines 1-3: Load the measured data (x and y), which are recorded for 20 seconds. Define the sampling rate and the length of segment (4 seconds).

Lines 4-7: Calculate the (one-sided) spectral density functions. We use a Hann window and 50 % overlap - this gives nine averages for each estimate.

Lines 8-10: Then calculate the frequency response function estimates $H_1(f)$, $H_2(f)$ and $H_T(f)$.

Lines 11-22: Plot the magnitude spectra of $H_1(f)$, $H_2(f)$ and $H_T(f)$ for the frequency range 5 to 90 Hz.

Lines 23-26: Plot the phase spectra of $H_1(f)$, $H_2(f)$ and $H_T(f)$ for the frequency range 5 to 90 Hz. Note that they are almost identical.
[Result figures are not reproduced here. Recoverable panel captions: (b3) Magnitude spectrum of $H_T(f)$; (b4) Phase spectra of $H_1(f)$, $H_2(f)$ and $H_T(f)$. Horizontal axes: Frequency (Hz).]


Comments:

1. As shown in Figure (b4), the phase spectra of $H_1(f)$, $H_2(f)$ and $H_T(f)$ are the same.
However, the results of the magnitude spectra show that $H_1(f)$ considerably underestimates
at resonances compared with the other estimates $H_2(f)$ and $H_T(f)$. Figure (c)
shows the differences in detail.

(c) Magnitude spectra of $H_1(f)$, $H_2(f)$ and $H_T(f)$

Also, plot the coherence function by typing the following script. The result is shown
in Figure (d).

    figure(5)
    Gamma=abs(Gxy).^2./(Gxx.*Gyy);
    plot(f, Gamma); axis([5 90 0 1.1])
    xlabel('Frequency (Hz)'); ylabel('Coherence function')

Note that the value of the coherence function drops at resonances due to the bias
error, which will be discussed in Chapter 10 (see also Comments 1 in MATLAB
Example 9.1).

2. With reference to the experimental setup in Figure (a), there is another practical aspect
to be considered. It is often the case that the anti-aliasing filters A and B introduce
different delays. For example, if the filter B introduces more delay than the filter A,
the phase spectrum becomes as shown in Figure (e). Thus, it is recommended that the
same type of filter is used for both input and output.

(e) Phase spectrum of FRF (delay in the filter B > delay in the filter A)


10 Estimation Methods and Statistical Considerations


Introduction 

So far, we have discussed random processes in terms of ideal quantities: probability 
density functions, correlation functions and spectral density functions. The results in 
Chapter 9 used these theoretical concepts, and the MATLAB examples used estimation 
methods that anticipated what is presented in this chapter. In this chapter, we introduce 
statistical estimation methods for random signals based on a single realization (record) of 
the process, and show how the theoretical quantities may be estimated and the accuracy 
of these estimates. 

While omitting many mathematical justifications, the details of results quoted in 
this chapter may be found in many of the references, especially in Jenkins and Watts 
(1968) and Bendat and Piersol (2000). Readers who wish to know the details of statistical 
properties of random processes should refer to these two excellent texts. 


10.1 ESTIMATOR ERRORS AND ACCURACY 

Suppose we only have a single time record x(t) with a length of T, taken from a stochastic
process. If we want to know the mean value $\mu_x$ of the process, then a logical estimate
(denoted by $\bar{x}$) of $\mu_x$ is

$$\bar{x} = \frac{1}{T}\int_0^T x(t)\,dt \tag{10.1}$$

The value obtained by Equation (10.1) is a sample value of a random variable, say $\bar{X}$,
which has its own probability distribution (this is called the sampling distribution). And
$\bar{x}$ is a single realization of the random variable $\bar{X}$. Each time we compute a value $\bar{x}$
from a different length of record we get a different value. If our estimation procedure is
'satisfactory', we may expect that:

(i) the scatter of values of $\bar{x}$ is not too great and they lie close to the true mean value $\mu_x$;

(ii) the more data we use (i.e. the larger T), the better the estimate.

We now formalize these ideas. Let $\phi$ be the parameter we wish to estimate (i.e. $\phi$
is the theoretical quantity, e.g. $\mu_x$ above) and let $\hat{\Phi}$ be an estimator for $\phi$. Then $\hat{\Phi}$ is a
random variable with its own probability distribution, e.g. as shown in Figure 10.1, where
$\hat{\phi}$ is the value (or estimate) of the random variable $\hat{\Phi}$.


Figure 10.1 Probability density function of $\hat{\phi}$

We see here that the estimates $\hat{\phi}$ we would obtain can take a whole range of values
but would predominantly take values near $a$. It is often difficult to obtain the sampling
distribution and so we shall settle for a few summarizing properties.

Bias 

The bias of an estimator is defined as

$$b(\hat{\Phi}) = E[\hat{\Phi}] - \phi \tag{10.2}$$

i.e. the difference between the average of the estimator and the true value. Note that
$E[\hat{\Phi}]$ is $a$ in the case shown in Figure 10.1. Thus, the bias is a measure of the average
offset of the estimator. If $b(\hat{\Phi}) = 0$, then the estimator is ‘unbiased’. Although it seems
desirable to use an unbiased estimator, we may need to allow some bias of the estimator
if the variability of the estimate can be reduced (relative to that of an unbiased estimator).

Variance 

The variance of an estimator is defined as

$$\mathrm{Var}(\hat{\Phi}) = E\left[(\hat{\Phi} - E[\hat{\Phi}])^2\right] = E[\hat{\Phi}^2] - E^2[\hat{\Phi}] \tag{10.3}$$

This is a measure of the dispersion or spread of values of $\hat{\Phi}$ about its own mean value
(see Section 7.3). Note that the square root of the variance is the standard deviation $\sigma(\hat{\Phi})$
of the estimator. In general, it is desirable to have a small variance, i.e. the probability
density function should be ‘peaky’. This requirement often results in an increase of the
bias error.


Mean Square Error 

The mean square error (mse) of an estimator is a measure of the spread of values of $\hat{\Phi}$
about the true (theoretical) value $\phi$, i.e.

$$\mathrm{mse}(\hat{\Phi}) = E\left[(\hat{\Phi} - \phi)^2\right] \tag{10.4}$$

Since $E[(\hat{\Phi} - \phi)^2] = E[(\hat{\Phi} - E[\hat{\Phi}] + E[\hat{\Phi}] - \phi)^2] = E[(\hat{\Phi} - E[\hat{\Phi}])^2] + E[(E[\hat{\Phi}] - \phi)^2]$
(the cross term vanishes because $E[\hat{\Phi} - E[\hat{\Phi}]] = 0$), the above equation can be rewritten as

$$\mathrm{mse}(\hat{\Phi}) = \mathrm{Var}(\hat{\Phi}) + b^2(\hat{\Phi}) \tag{10.5}$$

which shows that the mean square error reflects both variance and bias. Thus, the mean
square error is often used as a measure of the relative importance of bias and variance.
For example, if an estimator has a smaller mean square error than any other estimator,
it is said to be more efficient than the other estimators.

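If the mean square error decreases towards zero as the sample size (amount of data) used to
compute the estimate increases, then the estimator is consistent. Sometimes the errors
are non-dimensionalized (normalized) by dividing them by the quantity being estimated
(for $\phi \neq 0$), e.g. as

$$\text{Bias error:}\quad \varepsilon_b = \frac{b(\hat{\Phi})}{\phi} \tag{10.6}$$

$$\text{Random error:}\quad \varepsilon_r = \frac{\sigma(\hat{\Phi})}{\phi} \tag{10.7}$$

$$\text{RMS error:}\quad \varepsilon = \frac{\sqrt{\mathrm{mse}(\hat{\Phi})}}{\phi} \tag{10.8}$$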
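
The decomposition in Equation (10.5) can be illustrated numerically. The following is a minimal sketch, not part of the original text: it assumes independent Gaussian samples and compares two hypothetical estimators of the variance (divisors N and N - 1), estimating bias, variance and mean square error empirically over many records. The names `estimator_summary`, `biased` and `unbiased` are illustrative only.

```python
import random
import statistics

def estimator_summary(estimates, true_value):
    """Empirical bias, variance and mean square error of a list of estimates."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - true_value
    var = statistics.pvariance(estimates, mu=mean_est)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    return bias, var, mse

random.seed(0)
N, trials, true_var = 10, 20000, 4.0   # assumed record length and true variance
biased, unbiased = [], []
for _ in range(trials):
    x = [random.gauss(0.0, true_var ** 0.5) for _ in range(N)]
    m = sum(x) / N
    ss = sum((v - m) ** 2 for v in x)
    biased.append(ss / N)          # divisor N
    unbiased.append(ss / (N - 1))  # divisor N - 1

for name, est in (("divisor N", biased), ("divisor N-1", unbiased)):
    b, v, mse = estimator_summary(est, true_var)
    print(f"{name}: bias={b:.3f} var={v:.3f} mse={mse:.3f} var+bias^2={v + b * b:.3f}")
```

In this particular setting the estimator with the non-zero bias tends to show the smaller mean square error, which is one illustration of why some bias may be tolerated when it reduces variability.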


Confidence Intervals 

The estimate $\hat{\phi}$ we have discussed so far is a point estimate, i.e. a single value. It is
often desirable to define a certain interval of values in which the parameter is likely to
fall. For example, if we estimate a mean value $\bar{x}$ as 50, then perhaps it is ‘likely’ that
$\mu_x$ lies in the interval 45 to 55. This estimate is an interval estimate, and is called the
confidence interval when we attach a number describing the likelihood of the parameter
falling within the interval.

For example, if we say ‘a 95 % confidence interval for $\mu_x$ is (45, 55)’, then this means
we are 95 % confident that $\mu_x$ lies in the range (45, 55). Note that this does not mean that
the probability that $\mu_x$ lies in the interval (45, 55) is 0.95, because $\mu_x$ is not a random
variable and so we cannot assign probabilities to it. Instead, we mean that if we could
realize a large number of samples and find a confidence interval for $\mu_x$ for each sample,
then approximately 95 % of these intervals would contain the true value $\mu_x$. In order to
calculate confidence intervals, we need to know the sampling distribution of the estimator.
We shall return to this problem when we discuss the spectral density function estimates.
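
The repeated-sampling interpretation of a confidence interval can be sketched with a small simulation. This is an illustrative example only, under assumptions not made in the text: the samples are independent and Gaussian with a known standard deviation, so that the sampling distribution of the sample mean is Gaussian and the 95 % interval is the sample mean plus or minus about 1.96 standard errors. All numerical values are hypothetical.

```python
import random
from statistics import NormalDist

random.seed(1)
mu_true, sigma, N, trials = 50.0, 10.0, 25, 4000   # assumed values
z = NormalDist().inv_cdf(0.975)                     # two-sided 95 % point
half_width = z * sigma / N ** 0.5                   # known-sigma interval half-width

covered = 0
for _ in range(trials):
    xbar = sum(random.gauss(mu_true, sigma) for _ in range(N)) / N
    # Does this record's interval contain the (fixed, non-random) true mean?
    if xbar - half_width <= mu_true <= xbar + half_width:
        covered += 1

coverage = covered / trials
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The true mean never changes in this experiment; it is the interval that varies from record to record, which is the point made above.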

In the following sections, we shall give a summary of: (i) definitions of commonly used 
estimators; (ii) some statistical properties of the estimators; and (iii) some computational 
aspects for calculating the estimates. Unless otherwise stated, we shall assume we are 
dealing with realizations of a continuous, stationary random process. Also, if the data are 
sampled we assume that the sampling rate is sufficiently high so that there is no aliasing. 


10.2 MEAN VALUE AND MEAN SQUARE VALUE 

The Mean Value of x(t)

For a stationary stochastic process $x(t)$, the mean value (from data with length $T$) is estimated
as

$$\hat{\mu}_x = \frac{1}{T}\int_0^T x(t)\,dt \tag{10.9}$$

where the true mean value is $\mu_x$. Note that we have changed our notation for sample mean
from $\bar{x}$ to $\hat{\mu}_x$ to use the circumflex notation for an estimate. The average of this estimate is

$$E[\hat{\mu}_x] = \frac{1}{T}\int_0^T E[x(t)]\,dt = \frac{1}{T}\int_0^T \mu_x\,dt = \mu_x \tag{10.10}$$


i.e. $\hat{\mu}_x$ is unbiased. Now consider the mean square error which is

$$\mathrm{mse}(\hat{\mu}_x) = E\left[(\hat{\mu}_x - \mu_x)^2\right] = \frac{1}{T^2}E\left[\int_0^T\!\!\int_0^T (x(t_1) - \mu_x)(x(t_2) - \mu_x)\,dt_1\,dt_2\right]$$

$$= \frac{1}{T^2}\int_0^T\!\!\int_0^T C_{xx}(t_2 - t_1)\,dt_1\,dt_2 = \frac{1}{T^2}\int_0^T\!\!\int_{-t_1}^{T-t_1} C_{xx}(\tau)\,d\tau\,dt_1 \tag{10.11}$$

where $\tau = t_2 - t_1$ and $C_{xx}(\tau)$ is the autocovariance function. By reversing the integration
order and changing the limits of integration appropriately, this equation can be written as

$$\mathrm{mse}(\hat{\mu}_x) = \frac{1}{T^2}\int_0^T\!\!\int_0^{T-\tau} C_{xx}(\tau)\,dt_1\,d\tau + \frac{1}{T^2}\int_{-T}^0\!\!\int_{-\tau}^{T} C_{xx}(\tau)\,dt_1\,d\tau
= \frac{1}{T}\int_{-T}^{T} C_{xx}(\tau)\left(1 - \frac{|\tau|}{T}\right)d\tau \tag{10.12}$$

The integrand is a triangular weighted covariance function, so the integral is finite. Thus
$\mathrm{mse}(\hat{\mu}_x) \to 0$ as $T \to \infty$, i.e. this estimator is consistent.

For example, if $C_{xx}(\tau) = Ke^{-\lambda|\tau|}$, then the power spectral density function is

$$S_{xx}(f) = K\frac{2\lambda}{\lambda^2 + (2\pi f)^2} \quad \text{(assuming zero mean value)}$$

and the 3 dB bandwidth is $B = \lambda/\pi$ Hz. The mean square error may be approximated as

$$\mathrm{mse}(\hat{\mu}_x) \approx \frac{1}{T}\int_{-\infty}^{\infty} C_{xx}(\tau)\,d\tau = \frac{1}{T}\int_{-\infty}^{\infty} Ke^{-\lambda|\tau|}\,d\tau = \frac{2K}{T\lambda} = \frac{2K}{\pi BT} \tag{10.13}$$

i.e. it is inversely proportional to the bandwidth-time ($BT$) product of the data.
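
As a rough numerical check of how good the approximation (10.13) is, the triangular-weighted integral of Equation (10.12) can be evaluated directly for the exponential autocovariance. The sketch below is illustrative only: the values of $K$, $\lambda$ and $T$ are arbitrary assumptions, and a simple midpoint rule stands in for the integral.

```python
import math

def mse_mean_exact(K, lam, T, steps=200000):
    """Midpoint-rule evaluation of (1/T) * int_{-T}^{T} C(tau) (1 - |tau|/T) dtau."""
    h = 2.0 * T / steps
    total = 0.0
    for i in range(steps):
        tau = -T + (i + 0.5) * h
        total += K * math.exp(-lam * abs(tau)) * (1.0 - abs(tau) / T)
    return total * h / T

def mse_mean_approx(K, lam, T):
    """Approximation 2K / (pi * B * T) with 3 dB bandwidth B = lam / pi."""
    B = lam / math.pi
    return 2.0 * K / (math.pi * B * T)

K, lam = 1.0, 2.0 * math.pi * 5.0      # assumed values; here B = 10 Hz
for T in (1.0, 10.0, 100.0):
    exact = mse_mean_exact(K, lam, T)
    approx = mse_mean_approx(K, lam, T)
    print(f"T={T:6.1f}  BT={lam / math.pi * T:7.1f}  exact/approx={exact / approx:.4f}")
```

Under these assumptions the ratio approaches one as the $BT$ product grows, consistent with the approximation being intended for records that are long compared with the correlation time.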

To perform the calculation using digitized data (sampled at every $\Delta$ seconds), the mean
value can be estimated by (as Equation (8.39))

$$\hat{\mu}_x = \frac{1}{N}\sum_{n=0}^{N-1} x(n\Delta) \tag{10.14}$$


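The Mean Square Value of x(t)

As for the mean value, the mean square value is estimated as

$$\hat{\psi}_x^2 = \frac{1}{T}\int_0^T x^2(t)\,dt \tag{10.15}$$

The mean of this estimate is

$$E[\hat{\psi}_x^2] = \frac{1}{T}\int_0^T E[x^2(t)]\,dt = \psi_x^2 \tag{10.16}$$

where the true mean square value is $\psi_x^2$. Thus, the estimator is unbiased. The variance of the
estimate is

$$\mathrm{Var}(\hat{\psi}_x^2) = E\left[(\hat{\psi}_x^2 - \psi_x^2)^2\right] = E\left[(\hat{\psi}_x^2)^2\right] - (\psi_x^2)^2
= \frac{1}{T^2}\int_0^T\!\!\int_0^T \left(E[x^2(t_1)x^2(t_2)] - (\psi_x^2)^2\right)dt_1\,dt_2 \tag{10.17}$$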


If we assume that $x(t)$ is Gaussian we can use the following result to simplify the integrand. If
the random variables $X_1$, $X_2$, $X_3$ and $X_4$ are jointly Gaussian, it can be shown that (Papoulis,
1991)

$$E[X_1X_2X_3X_4] = E[X_1X_2]E[X_3X_4] + E[X_1X_3]E[X_2X_4] + E[X_1X_4]E[X_2X_3] - 2E[X_1]E[X_2]E[X_3]E[X_4] \tag{10.18}$$

Using this result, it follows that $E[x^2(t_1)x^2(t_2)] = \psi_x^4 + 2R_{xx}^2(t_2 - t_1) - 2\mu_x^4$. After some
manipulation by letting $\tau = t_2 - t_1$, Equation (10.17) becomes (Bendat and Piersol,
2000)

$$\mathrm{Var}(\hat{\psi}_x^2) = \frac{2}{T}\int_{-T}^{T}\left(1 - \frac{|\tau|}{T}\right)\left(R_{xx}^2(\tau) - \mu_x^4\right)d\tau \tag{10.19}$$

Thus, for large $T$, if $R_{xx}(\tau)$ dies out ‘quickly’ compared with $T$, then

$$\mathrm{Var}(\hat{\psi}_x^2) \approx \frac{2}{T}\int_{-\infty}^{\infty}\left(R_{xx}^2(\tau) - \mu_x^4\right)d\tau \tag{10.20}$$

For example, if $\mu_x = 0$ and $R_{xx}(\tau) = Ke^{-\lambda|\tau|}$, then

$$\mathrm{Var}(\hat{\psi}_x^2) \approx \frac{2K^2}{T\lambda} = \frac{2K^2}{\pi BT} \tag{10.21}$$

Since $\hat{\psi}_x^2$ is an unbiased estimator, $\mathrm{mse}(\hat{\psi}_x^2) = \mathrm{Var}(\hat{\psi}_x^2)$, which is also inversely proportional
to the bandwidth-time product of the data. Note that the normalized rms error is

$$\varepsilon = \frac{\sqrt{\mathrm{mse}(\hat{\psi}_x^2)}}{\psi_x^2} = \frac{\sqrt{\mathrm{Var}(\hat{\psi}_x^2)}}{\psi_x^2} \approx \frac{\sqrt{2K^2/(T\lambda)}}{K} = \sqrt{\frac{2}{\pi BT}} \tag{10.22}$$

where

$$\psi_x^2 = \frac{1}{T}\int_0^T R_{xx}(0)\,dt = K \quad \text{(from Equation (10.16))}$$


In practice, we often calculate the variance of the signal by subtracting the mean first
from the data. In digital form, we might estimate the variance of $x(n\Delta)$ by

$$\mathrm{Var}(x) \approx \hat{\sigma}_x^2 = \frac{1}{N}\sum_{n=0}^{N-1}\left(x(n\Delta) - \hat{\mu}_x\right)^2 \tag{10.23}$$

However, if the observations are independent the above can be shown to be a biased estimate.
Thus, the divisor $N - 1$ is frequently used, i.e. the unbiased estimate is

$$\hat{\sigma}_x^2 = \frac{1}{N-1}\sum_{n=0}^{N-1}\left(x(n\Delta) - \hat{\mu}_x\right)^2 \tag{10.24}$$


10.3 CORRELATION AND COVARIANCE FUNCTIONS
The Autocorrelation (Autocovariance) Function 


If a signal $x(t)$ is defined for $0 \le t \le T$, then there are two commonly used estimates
of the theoretical autocovariance function $C_{xx}(\tau)$. They may be written as $\hat{C}^b_{xx}(\tau)$ and
$\hat{C}_{xx}(\tau)$, where

$$\hat{C}^b_{xx}(\tau) = \begin{cases} \dfrac{1}{T}\displaystyle\int_0^{T-|\tau|}\left(x(t) - \hat{\mu}_x\right)\left(x(t + |\tau|) - \hat{\mu}_x\right)dt & 0 \le |\tau| < T \\[2ex] 0 & |\tau| > T \end{cases} \tag{10.25}$$

and

$$\hat{C}_{xx}(\tau) = \begin{cases} \dfrac{1}{T - |\tau|}\displaystyle\int_0^{T-|\tau|}\left(x(t) - \hat{\mu}_x\right)\left(x(t + |\tau|) - \hat{\mu}_x\right)dt & 0 \le |\tau| < T \\[2ex] 0 & |\tau| > T \end{cases} \tag{10.26}$$

The superscript $b$ in Equation (10.25) denotes a biased estimate. Often the latter expression
is used since it is unbiased. However, both these estimators are used because they have
intuitive appeal and should be compared on the basis of some criterion (e.g. the mean
square error) to choose between them. The estimates for the theoretical autocorrelation
function $R_{xx}(\tau)$ may be expressed as $\hat{R}^b_{xx}(\tau)$ and $\hat{R}_{xx}(\tau)$ in the same way as above by
omitting $\hat{\mu}_x$ in the equations.


For convenience, suppose that the process $x(t)$ has zero mean (so that $C_{xx}(\tau) = R_{xx}(\tau)$).
Then, as in Jenkins and Watts (1968), we can calculate the bias and variance:

1. Bias: Since

$$E[\hat{R}^b_{xx}(\tau)] = \frac{1}{T}\int_0^{T-|\tau|} E[x(t)x(t + |\tau|)]\,dt \qquad 0 \le |\tau| < T$$

the expected value of the biased estimator is

$$E[\hat{R}^b_{xx}(\tau)] = \begin{cases} R_{xx}(\tau)\left(1 - \dfrac{|\tau|}{T}\right) & 0 \le |\tau| < T \\[1.5ex] 0 & |\tau| > T \end{cases} \tag{10.27}$$

and the expected value of the unbiased estimator is

$$E[\hat{R}_{xx}(\tau)] = \begin{cases} R_{xx}(\tau) & 0 \le |\tau| < T \\ 0 & |\tau| > T \end{cases} \tag{10.28}$$

That is, for $\tau$ in the range $0 \le |\tau| < T$, $\hat{R}_{xx}(\tau)$ is an unbiased estimator, whilst $\hat{R}^b_{xx}(\tau)$ is
biased but the bias is small when $|\tau|/T \ll 1$ (i.e. asymptotically unbiased).

2. Variance: The variances of biased and unbiased estimators are

$$\mathrm{Var}(\hat{R}^b_{xx}(\tau)) = \frac{1}{T^2}\int_{-(T-\tau)}^{T-\tau}\left(T - \tau - |r|\right)\left(R_{xx}^2(r) + R_{xx}(r + \tau)R_{xx}(r - \tau)\right)dr \tag{10.29}$$

$$\mathrm{Var}(\hat{R}_{xx}(\tau)) = \frac{1}{(T-\tau)^2}\int_{-(T-\tau)}^{T-\tau}\left(T - \tau - |r|\right)\left(R_{xx}^2(r) + R_{xx}(r + \tau)R_{xx}(r - \tau)\right)dr \tag{10.30}$$

When $T$ is large compared with $\tau$, a useful approximation for both equations is

$$\mathrm{Var}(\hat{R}_{xx}(\tau)) \approx \mathrm{Var}(\hat{R}^b_{xx}(\tau)) \approx \frac{1}{T}\int_{-\infty}^{\infty}\left(R_{xx}^2(r) + R_{xx}(r + \tau)R_{xx}(r - \tau)\right)dr \tag{10.31}$$

Note that the variance of the estimates is inversely proportional to the length of data, i.e.
$\mathrm{Var}(\hat{R}_{xx}(\tau)) \propto 1/T$. This shows that both estimators are consistent. Thus, the
autocorrelation function may be estimated with diminishing error as the length of the data increases
(this also shows that the estimate of the autocorrelation function is ergodic).


Comparison of the Two Estimators, $\hat{R}^b_{xx}(\tau)$ and $\hat{R}_{xx}(\tau)$

For $\tau \ll T$, there is little difference between the two estimators, i.e. $\hat{R}^b_{xx}(\tau) \approx \hat{R}_{xx}(\tau)$.
However, Jenkins and Watts conjecture that $\mathrm{mse}(\hat{R}_{xx}(\tau)) > \mathrm{mse}(\hat{R}^b_{xx}(\tau))$. In fact, by considering
the divisor in Equation (10.30), it is easy to see that as $\tau \to T$ the variance of the unbiased
estimator $\hat{R}_{xx}(\tau)$ tends to infinity (i.e. diverges). It is this behaviour that makes the unbiased
estimator unsatisfactory. However, we note that the unbiased estimator is often used in
practical engineering despite the relatively larger mean square error. As a rough guide to using
$\hat{R}_{xx}(\tau)$, the ratio of the maximum lag to the total data length, $\tau_{\max}/T$, should not exceed 0.1.

Another important feature of the estimators is that adjacent autocorrelation function
estimates will have (in general) strong correlations, and so the sample autocorrelation function
$\hat{R}_{xx}(\tau)$ (and $\hat{R}^b_{xx}(\tau)$) gives more strongly correlated results than the original time series $x(t)$,
i.e. the estimate may not decay as rapidly as might be expected (Jenkins and Watts, 1968).

The Cross-correlation (Cross-covariance) Function 

If $x(t)$ and $y(t)$ are random signals defined for $0 \le t \le T$, the sample cross-covariance
function is defined as

$$\hat{C}_{xy}(\tau) = \begin{cases} \dfrac{1}{T - \tau}\displaystyle\int_0^{T-\tau}\left(x(t) - \hat{\mu}_x\right)\left(y(t + \tau) - \hat{\mu}_y\right)dt & 0 \le \tau < T \\[2ex] \dfrac{1}{T - |\tau|}\displaystyle\int_{|\tau|}^{T}\left(x(t) - \hat{\mu}_x\right)\left(y(t + \tau) - \hat{\mu}_y\right)dt & -T < \tau \le 0 \\[2ex] 0 & |\tau| > T \end{cases} \tag{10.32}$$

This is an unbiased estimator. The same integral but with divisor $T$ can also be used, which
is then the biased estimator $\hat{C}^b_{xy}(\tau)$. Similarly the sample cross-correlation function may be
expressed as $\hat{R}_{xy}(\tau)$ or $\hat{R}^b_{xy}(\tau)$ without subtracting mean values.

The estimators for the cross-correlation (cross-covariance) function have statistical
properties that are very similar to those of the autocorrelation (autocovariance) function: $\hat{R}_{xy}(\tau)$
is unbiased whilst $\hat{R}^b_{xy}(\tau)$ is asymptotically unbiased, and $\hat{R}^b_{xy}(\tau)$ has a smaller mean square
error, i.e. $\mathrm{mse}(\hat{R}_{xy}(\tau)) > \mathrm{mse}(\hat{R}^b_{xy}(\tau))$.

Methods of Calculation Using Sampled Data 

From sampled data, the autocovariance and cross-covariance functions are evaluated from

$$\hat{C}_{xx}(m\Delta) = \frac{1}{N - m}\sum_{n=0}^{N-m-1}\left(x(n\Delta) - \hat{\mu}_x\right)\left(x((n + m)\Delta) - \hat{\mu}_x\right) \qquad 0 \le m \le N - 1 \tag{10.33}$$

and

$$\hat{C}_{xy}(m\Delta) = \frac{1}{N - m}\sum_{n=0}^{N-m-1}\left(x(n\Delta) - \hat{\mu}_x\right)\left(y((n + m)\Delta) - \hat{\mu}_y\right) \qquad 0 \le m \le N - 1 \tag{10.34}$$

where both $x(n\Delta)$ and $y(n\Delta)$ are $N$-point sequences, i.e. they are defined for
$n = 0, 1, \ldots, N - 1$, and $m$ is the lag that may take values $0 \le m \le N - 1$. Note that these are
unbiased estimators, and the divisor is $N$ for the biased estimators $\hat{C}^b_{xx}(m\Delta)$ and $\hat{C}^b_{xy}(m\Delta)$.
The same expressions are applied for the computation of autocorrelation and cross-correlation
functions, e.g. $\hat{R}_{xx}(m\Delta)$ and $\hat{R}_{xy}(m\Delta)$ are obtained without subtracting mean values in
Equations (10.33) and (10.34).
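
The ‘mean lagged product’ form of Equation (10.33) translates almost directly into code. The following is a minimal sketch rather than a reference implementation; the function name `autocovariance` and its `biased` flag are illustrative choices, and the sampling interval is left implicit.

```python
def autocovariance(x, max_lag, biased=False):
    """Sample autocovariance for lags 0..max_lag.

    Divisor is N - m (unbiased form) or N (biased form) at lag m.
    """
    N = len(x)
    mean = sum(x) / N
    d = [v - mean for v in x]               # subtract the sample mean first
    out = []
    for m in range(max_lag + 1):
        s = sum(d[n] * d[n + m] for n in range(N - m))
        out.append(s / (N if biased else N - m))
    return out

x = [1.0, 2.0, 3.0, 4.0]
print(autocovariance(x, 3))                 # divisor N - m
print(autocovariance(x, 3, biased=True))    # divisor N
```

At the largest lags the unbiased form averages only a handful of products, which is the practical reason behind the rough guide of keeping the maximum lag to a small fraction of the record length.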

We note that the autocorrelation function is even, i.e. $\hat{R}_{xx}(-m\Delta) = \hat{R}_{xx}(m\Delta)$, and the
cross-correlation function has a property that

$$\hat{R}_{xy}(-m\Delta) = \hat{R}_{yx}(m\Delta) = \frac{1}{N - m}\sum_{n=0}^{N-m-1} y(n\Delta)\,x((n + m)\Delta) \qquad 0 \le m \le N - 1 \tag{10.35}$$

The above expressions are the so-called ‘mean lagged product’ formulae and are evaluated 
directly if there are not too many multiply and add operations. However, it turns out that it is 
quicker to use FFT methods to evaluate these indirectly. 

Autocorrelation via FFT

The basis of this method lies in our earlier discussions in Chapter 6 on the convolution of two
sequences (i.e. the convolution sum). Recall that if $y(n) = \sum_{m=0}^{N-1} h(m)x(n - m)$, and $h(n)$
and $x(n)$ are $N$-point sequences, then $y(n)$ is a $(2N - 1)$-point sequence. Since the DFT of
the convolution of two sequences is the product of the DFTs of two sequences, as long as the
sequences are padded out with zeros to avoid circular convolution, the sequence $y(n)$ can be
obtained by $y(n) = \mathrm{IDFT}[H(k)X(k)]$. See Bendat and Piersol (2000) for more details.

The correlation calculation is very similar. In fact $\sum_{n=0}^{N-1} x(n)y(n + m)$ is a convolution
of $x(-n)$ with $y(n)$ and the DFT of this is

$$\mathrm{DFT}\left[\sum_{n=0}^{N-1} x(n)y(n + m)\right] = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} x(n)y(n + m)e^{-j(2\pi/N)mk} = X^*(k)Y(k) \tag{10.36}$$

Thus, the required correlation function $\hat{R}_{xy}(m\Delta)$ is the $\mathrm{IDFT}[X^*(k)Y(k)]$, scaled by
$1/(N - m)$. Note that we must ensure that circular effects are removed by adding zeros.

Pictorially, the computation of the autocorrelation function can be illustrated as in 
Figure 10.2. Note that the sequence is effectively periodic when the DFT is used. 


[Figure: the sequence x(n), 0 to N - 1, padded with zeros up to 2N, shown together with the shifted sequence x(n + m)]

Figure 10.2 Pictorial description of the computation of the autocorrelation function


We can see that the correlation of these periodic sequences is the same as the linear
correlation if there are as many zeros appended as data points. So the autocorrelation function
(without explicitly noting the sampling interval $\Delta$)

$$\hat{R}_{xx}(m) = \frac{1}{N - m}\sum_{n=0}^{N-m-1} x(n)x(n + m)$$

is obtained by:

1. Take $x(n)$ ($N$ points) and add $N$ zeros to it.

2. Form $X(k)$ ($2N$-point DFT).

3. Form $\mathrm{IDFT}[X^*(k)X(k)]$ and then scale appropriately by $1/(N - m)$.

The result will have the appearance as shown in Figure 10.3. This basic idea can be generalized
to cross-correlation functions.
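
The three steps above can be sketched as follows. This is an illustrative example only: a short recursive radix-2 FFT is written out so that the block is self-contained (a library FFT would normally be used instead), which means the padded length $2N$ must be a power of two. The function names are hypothetical.

```python
import cmath

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(A):
    """Inverse FFT via conjugation of the forward transform."""
    n = len(A)
    return [z.conjugate() / n for z in fft([z.conjugate() for z in A])]

def autocorr_fft(x):
    """Unbiased autocorrelation estimate for lags 0..N-1 via the DFT."""
    N = len(x)
    padded = list(x) + [0.0] * N                    # step 1: append N zeros
    X = fft(padded)                                 # step 2: 2N-point DFT
    r = ifft([Xk.conjugate() * Xk for Xk in X])     # step 3: IDFT[X*(k) X(k)]
    return [r[m].real / (N - m) for m in range(N)]  # scale by 1/(N - m)

def autocorr_direct(x):
    """Mean lagged product formula, for comparison."""
    N = len(x)
    return [sum(x[n] * x[n + m] for n in range(N - m)) / (N - m) for m in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
print(autocorr_fft(x))
print(autocorr_direct(x))
```

Both routes should agree to within rounding error; the appended zeros are what prevent the circular wrap-around from contaminating lags 0 to N - 1.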


Figure 10.3 Results of autocorrelation computation using the DFT


10.4 POWER SPECTRAL DENSITY FUNCTION

There are two main approaches to estimating the power spectral density function, namely
parametric (more recent) and non-parametric (traditional), as shown in Figure 10.4.

Estimation methods
    Non-parametric
        Filter bank method (analogue method)
        Indirect method (Fourier transform of autocorrelation function)
        Direct methods (e.g. segment averaging)
    Parametric (ARMA, maximum entropy, etc.)

Figure 10.4 Classification of the estimation methods for the power spectral density function

Estimation methods for the power spectral density function considered here will 
relate to the ‘traditional’ methods rather than ‘parametric’ methods. We shall outline 
three methods for the estimation of the power spectral density function: 

• Method (1): ‘Analogue’ method (filter bank method) 

• Method (2): Fourier transform of the autocorrelation function (indirect method) 

• Method (3): Direct methods. 

Note that Method (3) is the most widely used since with the advent of the FFT it is the 
quickest. 


Method (1): ‘Analogue’ Method (Filter Bank Method) 

The word ‘analogue’ is in quotes because this method can also be implemented digitally, but 
it is convenient to refer to continuous signals. The basic scheme is indicated in Figure 10.5. 

The basis of this method is that the variance of the signal is the area under the power
spectral density function curve (assuming zero mean value), i.e.

$$\mathrm{Var}(x(t)) = \sigma_x^2 = \int_0^{\infty} G_{xx}(f)\,df \tag{10.37}$$

where $G_{xx}(f)$ is regarded as a measure of the distribution of the power of the process over
frequency. So if we wish to know the power in the signal over some frequency band $f_c \pm B/2$,
then we pass the signal through a filter with that passband, square (to get the power), average
to reduce the fluctuations and divide by the bandwidth to obtain the ‘density’, i.e.

$$\hat{G}_{xx}(f_c) = \frac{1}{BT}\int_0^T x^2(t, f_c, B)\,dt \tag{10.38}$$

where $x(t, f_c, B)$ is the output of the band-pass filter.

[Figure: x(t) -> tunable narrow band-pass filter (centre frequency $f_c$, bandwidth $B$) -> x(t, $f_c$, B) -> squarer, integrator and averager -> $\hat{G}_{xx}(f_c)$]

Figure 10.5 Concept of the filter bank method

The key elements in any spectral estimation scheme are: (i) a procedure to ‘home in’ on
a narrow band, i.e. good resolution (low bias); (ii) the subsequent smoothing of the squared
estimate (i.e. low variance).

Let us assume an ideal band-pass filter and discuss the bias and variance of this estimate. 
The frequency response function of an ideal band-pass filter is shown in Figure 10.6. 

Figure 10.6 Frequency response function of an ideal band-pass filter


Bias 

The bias of the smoothed estimator is obtained from averaging $\hat{G}_{xx}(f_c)$, i.e.

$$E[\hat{G}_{xx}(f_c)] = \frac{1}{BT}\int_0^T E[x^2(t, f_c, B)]\,dt = \frac{E[x^2(t, f_c, B)]}{B} \tag{10.39}$$

Note that $E[x^2(t, f_c, B)]$ is the variance of the output of the filter, i.e.

$$E[x^2(t, f_c, B)] = \int_{f_c - B/2}^{f_c + B/2} G_{xx}(f)\,df$$

Thus,

$$E[\hat{G}_{xx}(f_c)] = \frac{1}{B}\int_{f_c - B/2}^{f_c + B/2} G_{xx}(f)\,df \tag{10.40}$$

So, in general, $E[\hat{G}_{xx}(f_c)] \neq G_{xx}(f_c)$, i.e. the estimate is biased. Expanding $G_{xx}(f)$ in a
Taylor series about the point $f = f_c$ gives

$$G_{xx}(f) \approx G_{xx}(f_c) + (f - f_c)G'_{xx}(f_c) + \frac{(f - f_c)^2}{2!}G''_{xx}(f_c) \tag{10.41}$$

Substituting this into the integral, we get (Bendat and Piersol, 2000)

$$E[\hat{G}_{xx}(f_c)] \approx G_{xx}(f_c) + \underbrace{\frac{B^2}{24}G''_{xx}(f_c)}_{\text{bias}} \tag{10.42}$$

Note that at a peak, $G''_{xx}(f) < 0$, so the power spectral density is underestimated (on
average); at a trough, $G''_{xx}(f) > 0$, so we have an overestimate, i.e. the dynamic range is
reduced as illustrated in Figure 10.7. Note also that poor resolution (large $B$) introduces
more bias error.



Figure 10.7 Demonstration of the bias error of $\hat{G}_{xx}(f)$


As can be seen from Equation (10.42), bias depends on the resolution $B$ (i.e. the filter
bandwidth) relative to the fine structure of the spectrum. As an example, consider a simple
oscillator (with a damping ratio of $\zeta$ and a resonance at $f_r$) excited by white noise. The output
power spectral density may be as shown in Figure 10.8, where the half-power point bandwidth
is given by $B_r \approx 2\zeta f_r$. For a given ideal filter with a bandwidth of $B$ centred at frequency $f_r$,
the normalized bias error at $f_r$ can be shown to be (Bendat and Piersol, 2000)

$$\varepsilon_b = \frac{b(\hat{G}_{xx}(f_r))}{G_{xx}(f_r)} \approx -\frac{1}{3}\left(\frac{B}{B_r}\right)^2 \tag{10.43}$$

Note that if $B = B_r$, the normalized bias error is $-33.3\,\%$, whereas the bias error may be
negligible if $B < B_r/4$ (where $|\varepsilon_b| < 2.1\,\%$).
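
The bias formula (10.43) can be compared with a direct evaluation of the band average in Equation (10.40). The sketch below is illustrative and rests on assumptions not stated in the text: the output spectrum is taken to be proportional to $1/[(1 - (f/f_r)^2)^2 + (2\zeta f/f_r)^2]$, the values $f_r = 100$ Hz and $\zeta = 0.01$ are arbitrary, and a midpoint rule replaces the integral. Since (10.43) comes from a truncated Taylor series, close agreement should be expected mainly when $B$ is small compared with $B_r$.

```python
def oscillator_psd(f, fr, zeta):
    """Assumed shape of the output PSD of a simple oscillator driven by white noise."""
    r = f / fr
    return 1.0 / ((1.0 - r * r) ** 2 + (2.0 * zeta * r) ** 2)

def normalized_bias(fr, zeta, B, steps=20001):
    """(band-averaged PSD over fr +/- B/2) / PSD(fr) - 1, by the midpoint rule."""
    h = B / steps
    total = sum(oscillator_psd(fr - B / 2.0 + (i + 0.5) * h, fr, zeta)
                for i in range(steps))
    return total * h / B / oscillator_psd(fr, fr, zeta) - 1.0

def taylor_bias(B, Br):
    """Normalized bias from the second-order approximation, -(1/3)(B/Br)^2."""
    return -(B / Br) ** 2 / 3.0

fr, zeta = 100.0, 0.01          # assumed resonance frequency (Hz) and damping ratio
Br = 2.0 * zeta * fr            # half-power bandwidth
for B in (Br / 8.0, Br / 4.0, Br / 2.0):
    print(f"B/Br={B / Br:.3f}  numerical={normalized_bias(fr, zeta, B):+.4f}  "
          f"formula={taylor_bias(B, Br):+.4f}")
```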


Figure 10.8 Illustration of 3 dB bandwidth and the filter bandwidth

Variance and the Mean Square Error


Assuming that the process is Gaussian and $G_{xx}(f)$ is constant over the bandwidth of
the filter, the variance of the estimate may be shown to be (Newland, 1984; Bendat and
Piersol, 2000)

$$\mathrm{Var}(\hat{G}_{xx}(f)) \approx \frac{G_{xx}^2(f)}{BT} \tag{10.44}$$

The mean square error is the sum of the variance and the square of bias, and we
normalize this to give

$$\varepsilon^2 = \frac{\mathrm{Var}(\hat{G}_{xx}(f)) + b^2(\hat{G}_{xx}(f))}{G_{xx}^2(f)} \approx \frac{1}{BT} + \frac{B^4}{576}\left(\frac{G''_{xx}(f)}{G_{xx}(f)}\right)^2 \tag{10.45}$$

Note the conflict - to suppress bias the filter bandwidth $B$ must be small (i.e. fine
resolution), but to reduce the variance the product $BT$ must be large. Note also that the
product $BT$ relates to controllable parameters, i.e. $B$ is the filter bandwidth (not the data
bandwidth), and the averaging time $T$ obviously affects the variance. While maintaining
small filter bandwidth, the only way to reduce the mean square error is by increasing the
averaging time $T$.


Comments on the Choice of Filter Bandwidth

The basic choice is between the constant (absolute) bandwidth and the constant (relative)
percentage (%) bandwidth. The constant bandwidth gives uniform resolution on a linear
frequency scale, as shown in Figure 10.9. See Randall (1987) for more details.

Figure 10.9 Constant bandwidth (10 Hz) filter

For constant bandwidth, the centre frequency of an ideal filter is defined as

$$f_c = \frac{f_u + f_l}{2} \tag{10.46}$$

where $f_u$ and $f_l$ are defined as in Figure 10.10. The centre frequency is simply the arithmetic
mean of the upper and the lower frequencies.

Constant bandwidth is useful if the signal has harmonically related components, i.e.
for detecting a harmonic pattern. However, note that if the bandwidth is satisfactory at high
frequencies, it is ‘coarse’ at the one below and swamps the next lowest on a logarithmic scale
(see Figure 10.11).

Figure 10.13 Comparison of the third octave filter and the octave filter


(lower frequency for a centre frequency of 31.5 Hz) to 22.5 kHz (upper frequency for a centre 
frequency of 16 kHz). 

Third octave filters (1/3 octave filters) are obtained as illustrated in Figure 10.13, i.e. each
octave band is divided into three geometrically equal subsections.

As shown in the figure, $\log(f_u/f_l) = \log 2$ for the octave filter, so $\log(f_u/f_l) = \frac{1}{3}\log 2$
for the third octave filter, i.e. $f_u = 2^{1/3}f_l$. (Note that this is approximately 1/10 of a decade,
i.e. $\frac{1}{3}\log 2 \approx 0.1 = \frac{1}{10}\log 10$.) The centre frequency is $f_c = \sqrt{f_u f_l} = \sqrt{2^{1/3}f_l^2} = 2^{1/6}f_l$,
and the bandwidth for the third octave filter is

$$\text{Bandwidth} = f_u - f_l = (2^{1/3} - 1)f_l \tag{10.50}$$

and the relative bandwidth is

$$\text{Relative bandwidth} = \frac{f_u - f_l}{f_c} = \frac{2^{1/3} - 1}{2^{1/6}} \approx 23.2\,\% \tag{10.51}$$

Note that, similar to the third octave filter, a $1/m$ octave filter may be defined so that $f_u = 2^{1/m}f_l$.
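
The band-edge relations for fractional octave filters are easy to tabulate. The sketch below is illustrative only; it assumes ideal filters with a geometric-mean centre frequency, as in the text, and the function names and the 1 kHz centre frequency are arbitrary choices.

```python
def fractional_octave_band(fc, m):
    """Lower and upper edges of an ideal 1/m octave band centred (geometrically) at fc."""
    half_step = 2.0 ** (1.0 / (2.0 * m))   # fu / fc = fc / fl = 2^(1/(2m))
    return fc / half_step, fc * half_step

def relative_bandwidth(m):
    """(fu - fl) / fc for an ideal 1/m octave band; independent of fc."""
    fl, fu = fractional_octave_band(1.0, m)
    return fu - fl

for m, name in ((1, "octave"), (3, "third octave")):
    fl, fu = fractional_octave_band(1000.0, m)   # assumed centre frequency of 1 kHz
    print(f"{name}: fl={fl:.1f} Hz, fu={fu:.1f} Hz, "
          f"relative bandwidth={100.0 * relative_bandwidth(m):.1f} %")
```

Because the relative bandwidth does not depend on the centre frequency, such filters give uniform resolution on a logarithmic frequency scale.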

The above considerations relate to ‘ideal’ filters. Various other definitions of bandwidth
exist, e.g. 3 dB bandwidth and noise bandwidth as shown in Figure 10.14. As mentioned in
Section 4.6, the (effective) noise bandwidth is defined as the width of an ideal rectangular filter
that would accumulate the same noise power from a white noise source as the practical filter
with the same reference transmission level. The 3 dB bandwidth is the width of the practical
filter at the 3 dB points. Although the noise bandwidth and the 3 dB bandwidth are close to
each other, the 3 dB bandwidth may be more useful when describing structural responses and
is often preferred since it is easier to measure. Later in this section, we shall define
another bandwidth for spectral windows, which is effectively the resolution bandwidth.

Figure 10.14 Noise bandwidth and 3 dB bandwidth of a practical filter

As a final comment, note that the phase characteristics of the filters are unimportant for
power spectra measurements.


Method (2): Fourier Transform of the Autocorrelation Function
(Indirect Method)

Consider a sample record $x(t)$, where $|t| < T/2$, and the corresponding Fourier transform
given by $X_T(f) = \int_{-T/2}^{T/2} x(t)e^{-j2\pi ft}\,dt$. Then, the raw (or sample) power spectral density
function is

$$\hat{S}_{xx}(f) = \frac{|X_T(f)|^2}{T} = \frac{1}{T}\int_{-T/2}^{T/2}\!\!\int_{-T/2}^{T/2} x(t)e^{-j2\pi ft}\,x(t_1)e^{j2\pi ft_1}\,dt\,dt_1 \tag{10.52}$$

Now, transforming the double integral by setting $u = t - t_1$ and $v = t_1$, as shown in
Figure 10.15, then Equation (10.52) may be rewritten as

$$\hat{S}_{xx}(f) = \int_0^T\left[\frac{1}{T}\int_{-T/2}^{T/2-u} x(u + v)x(v)\,dv\right]e^{-j2\pi fu}\,du + \int_{-T}^0\left[\frac{1}{T}\int_{-T/2-u}^{T/2} x(u + v)x(v)\,dv\right]e^{-j2\pi fu}\,du \tag{10.53}$$

By definition, the term in the first square bracket is $\hat{R}^b_{xx}(u)$ for $0 \le u < T$, and the term in the
second square bracket is $\hat{R}^b_{xx}(u)$ for $-T < u < 0$, i.e. $\hat{S}_{xx}(f) = \int_{-T}^{T} \hat{R}^b_{xx}(u)e^{-j2\pi fu}\,du$.

See Jenkins and Watts (1968) for more details.

Thus, an estimate of the power spectral density from a length of data $T$ (we assume that
$x(t)$ is defined for $|t| < T/2$) can be written as

$$\hat{S}_{xx}(f) = \int_{-T}^{T} \hat{R}^b_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau \tag{10.54}$$

Note that the sample power spectral density $\hat{S}_{xx}(f)$ is related to the sample autocorrelation
function $\hat{R}^b_{xx}(\tau)$ (biased estimator with divisor $T$). This relationship suggests that we might
estimate the power spectral density by first forming the sample autocorrelation function and
Fourier transforming this. (However, we do not presume the validity of the Wiener-Khinchin
theorem - which will follow shortly.) Note that $\hat{R}^b_{xx}(\tau) = 0$ for $|\tau| > T$, thus

$$\hat{S}_{xx}(f) = \int_{-\infty}^{\infty} \hat{R}^b_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau = F\{\hat{R}^b_{xx}(\tau)\} \tag{10.55}$$

However, as mentioned in Chapter 8, this is termed the ‘raw’ power spectral density since
it turns out that the variability of this estimator is independent of the data length $T$ as we shall
see soon. Now, first consider the bias of this estimator.


Bias of $\hat{S}_{xx}(f)$

Averaging $\hat{S}_{xx}(f)$ gives

$$E[\hat{S}_{xx}(f)] = \int_{-T}^{T} E[\hat{R}^b_{xx}(\tau)]e^{-j2\pi f\tau}\,d\tau \tag{10.56}$$

and using Equation (10.27), this becomes

$$E[\hat{S}_{xx}(f)] = \int_{-T}^{T} R_{xx}(\tau)\left(1 - \frac{|\tau|}{T}\right)e^{-j2\pi f\tau}\,d\tau \qquad 0 \le |\tau| < T \tag{10.57}$$

Thus, if $T$ is sufficiently large, i.e. $T \to \infty$, then

$$\lim_{T\to\infty} E[\hat{S}_{xx}(f)] = \lim_{T\to\infty}\frac{E[|X_T(f)|^2]}{T} = S_{xx}(f) = \int_{-\infty}^{\infty} R_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau \tag{10.58}$$

So $\hat{S}_{xx}(f)$ is an asymptotically unbiased estimator. The above result (10.58) proves the
Wiener-Khinchin theorem introduced in Chapter 8.


Variance of $\hat{S}_{xx}(f)$

As a prerequisite we need to discuss the properties of a random variable having the so-called
chi-squared distribution. We first outline some important results on this (Jenkins and Watts,
1968).

The Chi-squared Distribution

Let $X_1, X_2, \ldots, X_n$ be $n$ independent random variables, each of which has a normal
distribution with zero mean and unit standard deviation, and define a new random variable

$$\chi_n^2 = X_1^2 + X_2^2 + \cdots + X_n^2 \tag{10.59}$$

The distribution of $\chi_n^2$ is called the chi-squared distribution with ‘$n$ degrees of freedom’, where
the number of degrees of freedom represents the number of independent random variables $X_i$.
The general form of the $\chi^2$ probability density function with $\nu$ degrees of freedom is

$$p_{\chi_\nu^2}(x) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\,x^{(\nu/2)-1}e^{-x/2} \qquad 0 \le x \le \infty \tag{10.60}$$

where $\Gamma(\nu/2) = \int_0^{\infty} e^{-t}t^{(\nu/2)-1}\,dt$ is the gamma function. For some values of $\nu$, $p_{\chi_\nu^2}(x)$ are
shown in Figure 10.16.


Figure 10.16 Chi-squared probability density functions

For a small value of $\nu$, the distribution is non-symmetrical, but as $\nu$ increases the
chi-squared distribution tends to Gaussian, as predicted by the central limit theorem. The
first two moments of the $\chi_\nu^2$ random variable are

$$E[\chi_\nu^2] = \nu \tag{10.61}$$

$$\mathrm{Var}(\chi_\nu^2) = 2\nu \tag{10.62}$$

We now summarize two important properties of the chi-squared distribution: (i) the
decomposition theorem for chi-squared random variables; (ii) approximation by a chi-squared
distribution. The first property states that if a random variable $\chi_\nu^2$ is decomposed into $k$ random
variables according to $\chi_\nu^2 = \chi_{\nu_1}^2 + \chi_{\nu_2}^2 + \cdots + \chi_{\nu_k}^2$ and if $\nu_1 + \nu_2 + \cdots + \nu_k = \nu$, then the
random variables $\chi_{\nu_i}^2$ are mutually independent. Conversely, if $k$ independent random variables
$\chi_{\nu_i}^2$ are added together, then the sum is $\chi_\nu^2$, where

$$\nu = \nu_1 + \nu_2 + \cdots + \nu_k \tag{10.63}$$

The second property states that: suppose we have a positive-valued random variable
$Y$, and we wish to approximate its distribution by $a\chi_\nu^2$ where $a$ and $\nu$ are unknown, but
we may know the mean and variance of $Y$, i.e. $\mu_y$ and $\sigma_y^2$ are known. Then in this case,
$E[Y] = \mu_y = E[a\chi_\nu^2] = a\nu$ and $\sigma_y^2 = \mathrm{Var}(a\chi_\nu^2) = a^2\mathrm{Var}(\chi_\nu^2) = 2a^2\nu = 2\mu_y^2/\nu$. Thus $a$ and
$\nu$ are found from

$$a = \frac{\mu_y}{\nu} \tag{10.64}$$

$$\nu = \frac{2\mu_y^2}{\sigma_y^2} \tag{10.65}$$
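
Equations (10.64) and (10.65) amount to matching the first two moments, which can be written as a short helper. This is a sketch only; the function name `chi2_approximation` is hypothetical, and the example values correspond to the sum of two independent squared zero-mean Gaussian variables, each with an assumed variance of 3.

```python
def chi2_approximation(mean_y, var_y):
    """Moment-matched parameters (a, nu) such that Y is approximated by a * chi-squared(nu)."""
    nu = 2.0 * mean_y ** 2 / var_y   # Equation (10.65)
    a = mean_y / nu                  # Equation (10.64)
    return a, nu

# Y = X1^2 + X2^2 with Var(X1) = Var(X2) = s2 has mean 2*s2 and variance 4*s2^2
s2 = 3.0
a, nu = chi2_approximation(2.0 * s2, 4.0 * s2 ** 2)
print(a, nu)
```

For this example the matched values are the scale factor equal to the assumed variance and two degrees of freedom, which is the situation met below for the raw spectral estimate.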


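Variance Considerations

We now return to study the variance of $\hat{S}_{xx}(f)$. The sample power spectral density function
can be written as ($x(t)$ is defined for $|t| < T/2$)

$$\hat{S}_{xx}(f) = \frac{|X_T(f)|^2}{T} = \frac{1}{T}\left|\int_{-T/2}^{T/2} x(t)e^{-j2\pi ft}\,dt\right|^2 = \frac{1}{T}\left|\int_{-\infty}^{\infty} x(t)e^{-j2\pi ft}\,dt\right|^2$$

$$= \frac{1}{T}\left[\left(\int_{-\infty}^{\infty} x(t)\cos(2\pi ft)\,dt\right)^2 + \left(\int_{-\infty}^{\infty} x(t)\sin(2\pi ft)\,dt\right)^2\right] = \frac{1}{T}\left[X_c^2(f) + X_s^2(f)\right] \tag{10.66}$$

where $X_c(f)$ and $X_s(f)$ are Fourier cosine and sine transforms of $x(t)$.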

Let us assume that $x(t)$ is a Gaussian process with zero mean value; then $X_c(f)$ and $X_s(f)$ are also Gaussian and have zero mean values. Furthermore, it can be shown that $X_c(f)$ and $X_s(f)$ are uncorrelated and have approximately equal variances (Jenkins and Watts, 1968). Now if the variances were unity, we could use the properties of the chi-squared distribution to say $\hat{S}_{xx}(f)$ is related to a $\chi^2$ distribution (note that $\hat{S}_{xx}(f)$ is a squared quantity and so positive valued). We do this as follows. Let

$$\frac{1}{T}E\left[X_c^2(f)\right] = \frac{1}{T}E\left[X_s^2(f)\right] = \sigma^2 \quad \text{(say, for each frequency } f\text{)} \tag{10.67}$$

and

$$E\left[\hat{S}_{xx}(f)\right] \approx S_{xx}(f) = 2\sigma^2 \tag{10.68}$$

Then, it can be shown that

$$\frac{2\hat{S}_{xx}(f)}{S_{xx}(f)} = \frac{X_c^2(f)}{T\sigma^2} + \frac{X_s^2(f)}{T\sigma^2} \tag{10.69}$$

which is the sum of two squared Gaussian random variables with unit variances (note that $X_c(f)$ and $X_s(f)$ are jointly normally distributed (see Appendix G for justification) and uncorrelated, so they are independent). Therefore the random variable $2\hat{S}_{xx}(f)/S_{xx}(f)$ is distributed as a chi-squared random variable with two degrees of freedom, i.e. $\chi_2^2$ (for all values of sample length $T$). Using Equation (10.62), the variance is

$$\mathrm{Var}\left(\frac{2\hat{S}_{xx}(f)}{S_{xx}(f)}\right) = \frac{4\,\mathrm{Var}\left(\hat{S}_{xx}(f)\right)}{S_{xx}^2(f)} = 4 \tag{10.70}$$

and so

$$\mathrm{Var}\left(\hat{S}_{xx}(f)\right) = S_{xx}^2(f) \tag{10.71}$$

or

$$\sigma\left(\hat{S}_{xx}(f)\right) = S_{xx}(f) \tag{10.72}$$

This important result states that the estimator $\hat{S}_{xx}(f)$ has a variance that is independent of sample length $T$, i.e. $\hat{S}_{xx}(f)$ is an inconsistent estimate of $S_{xx}(f)$. Furthermore, the random error of the estimate is substantial, i.e. the standard deviation of the estimate is as great as the quantity being estimated. These undesirable features lead to the estimate $\hat{S}_{xx}(f)$ being referred to as the ‘raw’ spectrum estimate or ‘raw periodogram’. As it stands, $\hat{S}_{xx}(f)$ is not a useful estimator and we must reduce the random error. This may be accomplished by ‘smoothing’ as indicated below. However, as we shall see, the penalty for this is the degradation of accuracy due to bias error.
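The inconsistency expressed by Equation (10.72) can be checked numerically. The following is a minimal sketch, assuming unit-variance Gaussian white noise, a unit sampling interval and a single DFT bin; the helper names `raw_periodogram_bin` and `normalized_std` are hypothetical. It estimates the ratio $\sigma(\hat{S}_{xx})/E[\hat{S}_{xx}]$ for two record lengths; if the result above holds, the ratio should stay close to one however long the record is.

```python
import math
import random


def raw_periodogram_bin(x, k):
    """Raw periodogram |X(k)|^2 / N at DFT bin k (unit sampling interval)."""
    n = len(x)
    re = sum(v * math.cos(2.0 * math.pi * k * i / n) for i, v in enumerate(x))
    im = sum(v * math.sin(2.0 * math.pi * k * i / n) for i, v in enumerate(x))
    return (re * re + im * im) / n


def normalized_std(n_samples, trials=1000, seed=1):
    """Sample standard deviation of the raw estimate divided by its sample mean."""
    rng = random.Random(seed)
    k = n_samples // 4  # a bin away from zero frequency and the folding frequency
    values = []
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        values.append(raw_periodogram_bin(x, k))
    mean = sum(values) / trials
    var = sum((v - mean) ** 2 for v in values) / (trials - 1)
    return math.sqrt(var) / mean


for n in (32, 128):
    print(n, round(normalized_std(n), 2))
```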


Smoothed Spectral Density Estimators 

As we have already discussed, the average of $\hat{S}_{xx}(f)$ is given by Equation (10.57), where the integrand is $R_{xx}(\tau)(1 - |\tau|/T)$. This motivates us to study the effect of introducing a lag window, $w(\tau)$, i.e. the estimate may be smoothed in the frequency domain by Fourier transforming the product of the autocorrelation function estimate and $w(\tau)$, which has a Fourier transform $W(f)$ (called a ‘spectral window’). This defines the smoothed spectral estimator $\tilde{S}_{xx}(f)$ as

$$\tilde{S}_{xx}(f) = \int_{-\infty}^{\infty} \hat{R}_{xx}^{b}(\tau)\,w(\tau)\,e^{-j2\pi f\tau}\,d\tau \tag{10.73}$$

Recall that

$$\int_{-\infty}^{\infty} x(t)w(t)e^{-j2\pi ft}\,dt = \int_{-\infty}^{\infty} X(g)W(f-g)\,dg$$

i.e. $\tilde{S}_{xx}(f) = \hat{S}_{xx}(f) * W(f)$, which is the convolution of the raw spectral density with a spectral window. Thus the right hand side of Equation (10.73) has an alternative form

$$\tilde{S}_{xx}(f) = \int_{-\infty}^{\infty} \hat{S}_{xx}(g)\,W(f-g)\,dg \tag{10.74}$$

This estimation procedure may be viewed as a smoothing operation in the frequency domain. Thus the lag window $w(\tau)$ results in the spectral window $W(f)$ ‘smoothing’ the raw periodogram $\hat{S}_{xx}(f)$ through the convolution operation. In fact, the above method is the basis of the correlation method of estimating spectral density functions. In the time domain, the lag window can be regarded as reducing the ‘importance’ of values of $\hat{R}_{xx}^{b}(\tau)$ as $|\tau|$ increases.
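A discrete-time version of Equation (10.73) can be sketched as follows. This is only an illustration under stated assumptions: zero-mean sampled data with sampling interval `dt`, a Hann lag window truncated at a maximum lag, and hypothetical helper names `biased_autocorrelation` and `smoothed_psd`. The integral is approximated by a sum over lags, using the symmetry of the autocorrelation estimate.

```python
import math
import random


def biased_autocorrelation(x, max_lag):
    """Biased autocorrelation estimate (divide by N) for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + m] for i in range(n - m)) / n
            for m in range(max_lag + 1)]


def smoothed_psd(r, freqs, dt=1.0):
    """Fourier transform of (lag window x autocorrelation) at the given frequencies.

    r holds the autocorrelation estimate for lags 0..M; the Hann lag window
    w(m) = 0.5*(1 + cos(pi*m/M)) is applied and symmetry in the lag is used.
    """
    max_lag = len(r) - 1
    out = []
    for f in freqs:
        total = r[0]
        for m in range(1, max_lag + 1):
            w = 0.5 * (1.0 + math.cos(math.pi * m / max_lag))
            total += 2.0 * w * r[m] * math.cos(2.0 * math.pi * f * m * dt)
        out.append(dt * total)
    return out


# Unit-variance white noise with dt = 1 has a two-sided spectral density of 1.
rng = random.Random(3)
x = [rng.gauss(0.0, 1.0) for _ in range(4000)]
r = biased_autocorrelation(x, max_lag=20)
print([round(s, 2) for s in smoothed_psd(r, [0.1, 0.25, 0.4])])
```

A shorter maximum lag gives a smoother but less well resolved estimate, which is the bias and variance trade-off discussed next.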


It is necessary to start all over again and study the bias and variance properties of this new estimator $\tilde{S}_{xx}(f)$, where clearly the window function $w(\tau)$ will now play an important role. Jenkins and Watts (1968) and Priestley (1981) give a detailed discussion of this problem. We shall only quote the main results here.

Some commonly used window functions are listed in Table 10.1. The rectangular window is included for completeness, and other window functions may also be used. Note that the discussions on window functions given in Chapter 4 are directly related to this case.

The lag windows $w(\tau)$ are shown in Figure 10.17, where $w(0) = 1$ for all windows, and the spectral windows $W(f)$ are shown in Figure 10.18. Note that the spectral windows which take negative values might give rise to negative spectral density estimates in this approach.


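Table 10.1 Commonly used lag and spectral windows

| Window name | Lag window, $w(\tau)$ | Spectral window, $W(f)$ |
|---|---|---|
| Rectangular | $w(\tau) = 1$ for $\lvert\tau\rvert \le T_w$; $= 0$ for $\lvert\tau\rvert > T_w$ | $W(f) = 2T_w\,\dfrac{\sin(2\pi fT_w)}{2\pi fT_w}$ |
| Bartlett | $w(\tau) = 1 - \dfrac{\lvert\tau\rvert}{T_w}$ for $\lvert\tau\rvert \le T_w$; $= 0$ for $\lvert\tau\rvert > T_w$ | $W(f) = T_w\left(\dfrac{\sin(\pi fT_w)}{\pi fT_w}\right)^2$ |
| Hann(ing) | $w(\tau) = \dfrac{1}{2}\left(1 + \cos\dfrac{\pi\tau}{T_w}\right)$ for $\lvert\tau\rvert \le T_w$; $= 0$ for $\lvert\tau\rvert > T_w$ | $W(f) = T_w\left(\dfrac{\sin(2\pi fT_w)}{2\pi fT_w}\right)\dfrac{1}{1 - (2fT_w)^2}$ |
| Hamming | $w(\tau) = 0.54 + 0.46\cos\dfrac{\pi\tau}{T_w}$ for $\lvert\tau\rvert \le T_w$; $= 0$ for $\lvert\tau\rvert > T_w$ | $W(f) = 2T_w\,\dfrac{\left[0.54\pi^2 - 0.08(2\pi fT_w)^2\right]\sin(2\pi fT_w)}{2\pi fT_w\left[\pi^2 - (2\pi fT_w)^2\right]}$ |
| Parzen | $w(\tau) = 1 - 6\left(\dfrac{\tau}{T_w}\right)^2 + 6\left(\dfrac{\lvert\tau\rvert}{T_w}\right)^3$ for $\lvert\tau\rvert \le \dfrac{T_w}{2}$; $= 2\left(1 - \dfrac{\lvert\tau\rvert}{T_w}\right)^3$ for $\dfrac{T_w}{2} < \lvert\tau\rvert \le T_w$; $= 0$ for $\lvert\tau\rvert > T_w$ | $W(f) = \dfrac{3}{4}T_w\left(\dfrac{\sin(\pi fT_w/2)}{\pi fT_w/2}\right)^4$ |

Figure 10.17 Lag windows (for $\tau \ge 0$)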



Bias Considerations 

From Equation (10.74), the average of the smoothed power spectral density function is

$$E\left[\tilde{S}_{xx}(f)\right] = \int_{-\infty}^{\infty} E\left[\hat{S}_{xx}(g)\right]W(f-g)\,dg \approx \int_{-\infty}^{\infty} S_{xx}(g)\,W(f-g)\,dg \tag{10.75}$$

where the approximation uses $E[\hat{S}_{xx}(g)] \approx S_{xx}(g)$ for large $T$. So, $\tilde{S}_{xx}(f)$ is biased. Note that this equation indicates how the estimate may be distorted by the smearing and leakage effect of the spectral windows. In fact, for large $T$, the bias is shown to be

$$b\left(\tilde{S}_{xx}(f)\right) = E\left[\tilde{S}_{xx}(f)\right] - S_{xx}(f) \approx \int_{-\infty}^{\infty}\left[w(\tau) - 1\right]R_{xx}(\tau)\,e^{-j2\pi f\tau}\,d\tau \tag{10.76}$$

Note that the bias is different for each lag window. The details are given in Jenkins and Watts (1968). We may comment broadly that the general effect is to reduce the dynamic range of the spectra as in the filter bank method of spectral estimation, i.e. peaks are underestimated and troughs are overestimated.

As can be seen from Equation (10.75), the bias is reduced as the spectral window gets narrower, i.e. as the width of the lag window $w(\tau)$ gets larger, the spectral window becomes $W(f) \to \delta(f)$. So, $\tilde{S}_{xx}(f)$ is asymptotically unbiased (for $T_w \to \infty$). However, the spectral windows cannot be made too narrow since then there is little smoothing and the random errors increase, so we need $W(f)$ to have some width to do the smoothing. Consequently, once again we need a compromise between bias and variance, i.e. a trade-off between the resolution and the random error. Note that, sometimes, the bias problem is referred to under ‘bandwidth considerations’ since small bias is associated with small bandwidth of the window function.


Variance Considerations 

From Equation (10.74), $\tilde{S}_{xx}(f)$ can be considered as a weighted sum of values of $\hat{S}_{xx}(f)$. Thus, it may be argued that $n\tilde{S}_{xx}(f)/S_{xx}(f)$ is approximately distributed as a chi-squared random variable with $n$ degrees of freedom, where the number of degrees of freedom is defined as

$$n = \frac{2T}{\int_{-\infty}^{\infty} w^2(\tau)\,d\tau} = \frac{2T}{I} \tag{10.77}$$

which depends on the window and data length (Jenkins and Watts, 1968). Also, since $\mathrm{Var}\left(n\tilde{S}_{xx}(f)/S_{xx}(f)\right) = 2n$, it can be shown that

$$\mathrm{Var}\left(\tilde{S}_{xx}(f)\right) = \frac{S_{xx}^2(f)}{n/2} = \frac{S_{xx}^2(f)}{T/I} \tag{10.78}$$

$$\sigma\left(\tilde{S}_{xx}(f)\right) = \frac{S_{xx}(f)}{\sqrt{n/2}} = \frac{S_{xx}(f)}{\sqrt{T/I}} \tag{10.79}$$

Now, $1/I$ can be argued to be a measure of the resolution bandwidth $B$ of the window (see below for justification), so the number of degrees of freedom is $n = 2BT$. Thus, the above equations can be rewritten as

$$\frac{\mathrm{Var}\left(\tilde{S}_{xx}(f)\right)}{S_{xx}^2(f)} = \frac{1}{BT} \tag{10.80}$$

$$\frac{\sigma\left(\tilde{S}_{xx}(f)\right)}{S_{xx}(f)} = \frac{1}{\sqrt{BT}} \tag{10.81}$$


To justify that $1/I$ is a measure of bandwidth, consider an ideal filter shown in Figure 10.19, where

$$I = \int_{-\infty}^{\infty} w^2(\tau)\,d\tau = \int_{-\infty}^{\infty} W^2(f)\,df \tag{10.82}$$

[Figure: ideal spectral window $W(f)$ of height $1/B$ and width $B$, extending from $-B/2$ to $B/2$]

Figure 10.19 An ideal filter with a resolution bandwidth of $B$

Since

$$\int_{-\infty}^{\infty} W^2(f)\,df = \frac{1}{B}$$

it can be shown that

$$\frac{1}{I} = \frac{1}{\int_{-\infty}^{\infty} w^2(\tau)\,d\tau} = \frac{1}{\int_{-\infty}^{\infty} W^2(f)\,df} = B \tag{10.83}$$

When ‘non-ideal’ filters (Bartlett, Hann, Parzen, etc.) are used, $1/\int_{-\infty}^{\infty} w^2(\tau)\,d\tau$ defines a generalized bandwidth. This is the effective resolution of the spectral window.

A summary of bandwidths, biases and variance ratios of some window functions is given in Table 10.2 (Jenkins and Watts, 1968; Priestley, 1981).


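Table 10.2 Properties of some spectral windows

| Window name | Bandwidth, $1/I = B$ | DOF | Variance ratio, $\mathrm{Var}(\tilde{S}_{xx}(f))/S_{xx}^2(f)$ | Approximate bias |
|---|---|---|---|---|
| Rectangular | $0.5/T_w$ | $T/T_w$ | $2T_w/T$ | N/A |
| Bartlett | $1.5/T_w$ | $3T/T_w$ | $0.667\,T_w/T$ | $\approx -\dfrac{1}{T_w}\displaystyle\int_{-\infty}^{\infty}\lvert\tau\rvert R_{xx}(\tau)e^{-j2\pi f\tau}\,d\tau$ |
| Hann(ing) | $1.333/T_w$ | $2.667\,T/T_w$ | $0.75\,T_w/T$ | $\approx \dfrac{0.063}{T_w^2}S''_{xx}(f)$ |
| Hamming | $1.26/T_w$ | $2.52\,T/T_w$ | $0.795\,T_w/T$ | $\approx \dfrac{0.058}{T_w^2}S''_{xx}(f)$ |
| Parzen | $1.86/T_w$ | $3.71\,T/T_w$ | $0.539\,T_w/T$ | $\approx \dfrac{0.152}{T_w^2}S''_{xx}(f)$ |

Note: $S''_{xx}(f)$ is the second derivative of the spectrum at frequency $f$, $T$ is the total data length and $T_w$ is as defined in Table 10.1 (i.e. half of the lag window length).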
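The bandwidth, degrees of freedom and variance ratio entries all follow from $I = \int w^2(\tau)\,d\tau$ through Equations (10.77), (10.78) and (10.83), so they can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the lag windows are the standard definitions with half-length `tw`, and the integral is approximated by a midpoint rule.

```python
import math


def lag_window(name, tau, tw):
    """Standard lag windows with half-length tw (zero for |tau| > tw)."""
    u = abs(tau) / tw
    if u > 1.0:
        return 0.0
    if name == "rectangular":
        return 1.0
    if name == "bartlett":
        return 1.0 - u
    if name == "hann":
        return 0.5 * (1.0 + math.cos(math.pi * u))
    if name == "hamming":
        return 0.54 + 0.46 * math.cos(math.pi * u)
    if name == "parzen":
        if u <= 0.5:
            return 1.0 - 6.0 * u ** 2 + 6.0 * u ** 3
        return 2.0 * (1.0 - u) ** 3
    raise ValueError("unknown window: " + name)


def window_energy(name, tw, steps=4000):
    """Midpoint-rule approximation of I, the integral of w^2 over [-tw, tw]."""
    h = 2.0 * tw / steps
    return h * sum(lag_window(name, -tw + (i + 0.5) * h, tw) ** 2
                   for i in range(steps))


def table_row(name, tw, t):
    """Bandwidth B = 1/I, degrees of freedom 2T/I and variance ratio I/T."""
    i = window_energy(name, tw)
    return {"bandwidth": 1.0 / i, "dof": 2.0 * t / i, "variance_ratio": i / t}


for name in ("rectangular", "bartlett", "hann", "hamming", "parzen"):
    print(name, table_row(name, tw=1.0, t=100.0))
```

With `tw=1.0` the printed bandwidths can be compared directly with the coefficients in the second column of the table.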
General Comments on Window Functions and $\tilde{S}_{xx}(f)$

Although the rectangular window is included in Table 10.2, it is rarely used since its spectral 
window side lobes cause large ‘leakage’ effects. However, in Chapter 8, the rectangular window 
function is applied to estimate the spectral density functions in MATLAB Examples 8.8-8.10 
(note that we have used a very long data length T). 

The bias of the Bartlett window is of order $1/T_w$ and so will in general be greater than that of the other windows (order $1/T_w^2$). Among the Hann, Hamming and Parzen windows, the Parzen window has the smallest variance but the greatest bias. Since the bias of these windows depends on $S''_{xx}(f)$, larger bias occurs at sharp peaks and troughs than in other frequency regions. From the table, we can easily see that the bias is reduced as $T_w$ increases, but the variance is increased. The variance can then be reduced by increasing the total data length $T$.

We note that: (i) when the bias is small, $\tilde{S}_{xx}(f)$ is said to reproduce $S_{xx}(f)$ with high fidelity; (ii) when the variance is small, the estimator is said to have high stability (Jenkins and Watts, 1968). The choice of window functions should depend on whether the concern is for statistical stability (low variance) or high fidelity (small bias), although in general we must consider both. For example, if the spectral density function has narrow peaks of importance we may willingly tolerate some loss of stability to resolve the peak properly, while if the spectral density function is smooth then bias errors are not likely to be so important. Thus, we may state that the estimator $\tilde{S}_{xx}(f)$ is approximately unbiased and has a low variance only if a sufficiently narrow resolution bandwidth and yet long enough data are used.

When spectra are estimated via the autocorrelation function, many texts give only approximate values for the degrees of freedom and the resolution bandwidth. For example, with digital analysis with $N$ data points (sampled every $\Delta$ seconds) and a maximum correlation lag $M$, the number of degrees of freedom is usually quoted as $2N/M$ and the resolution bandwidth as $1/(M\Delta)$ (this may correspond to the rectangular window function).

Finally, instead of multiplying the sample autocorrelation function by the lag window and 
Fourier transforming the weighted sample autocorrelation function, an alternative procedure 
is to do the smoothing in the frequency domain, i.e. form the raw periodogram and perform 
the frequency convolution. This ‘frequency smoothing’ will be referred to again soon. 


Method (3): Direct Methods 

We shall now discuss the basis for forming smoothed power spectral density estimates without 
first forming the autocorrelation functions. These methods are probably the most widely 
used because of computational considerations. There are two methods, although they can be 
combined if required, namely (i) segment averaging; (ii) frequency smoothing. 


Segment Averaging (Welch’s Method)^M10.1

The segment averaging method has become very popular mainly because of its fast 
speed of computation. The method is discussed in Jenkins and Watts (1968) as Bartlett’s 
smoothing procedure and in Welch (1967) in some detail. The basic procedure is outlined 
with reference to Figure 10.20. 





Consider the data (length $T$) which is segmented into $q$ separate time slices each of length $T_r$ such that $qT_r = T$ (in this case non-overlapping). Now we form the raw periodogram for each slice as

$$\hat{S}_{xx_i}(f) = \frac{1}{T_r}\left|X_{T_r i}(f)\right|^2 \quad \text{for } i = 1, 2, \ldots, q \tag{10.84}$$

We saw earlier that this is distributed as a chi-squared random variable with two degrees of freedom. We might expect that by averaging successive raw spectra the underlying behaviour would reinforce and the variability would reduce, i.e. form

$$\tilde{S}_{xx}(f) = \frac{1}{q}\sum_{i=1}^{q}\hat{S}_{xx_i}(f) \tag{10.85}$$

We can estimate the variance reduction by the following argument. Note that for each segment $2\hat{S}_{xx_i}(f)/S_{xx}(f)$ is a $\chi_2^2$ random variable. From Equation (10.85),

$$\frac{2\tilde{S}_{xx}(f)\cdot q}{S_{xx}(f)} = \sum_{i=1}^{q}\frac{2\hat{S}_{xx_i}(f)}{S_{xx}(f)}$$

Thus, $(2\tilde{S}_{xx}(f)\cdot q)/S_{xx}(f)$ is the sum of $q$ $\chi_2^2$ random variables and so, assuming that these are essentially independent of each other, this is approximated as $\chi_{2q}^2$. From Equation (10.61),

$$E\left[\frac{2\tilde{S}_{xx}(f)\cdot q}{S_{xx}(f)}\right] \approx 2q \tag{10.86}$$

from which $E[\tilde{S}_{xx}(f)] \approx S_{xx}(f)$ (i.e. $\tilde{S}_{xx}(f)$ is approximately unbiased). From Equation (10.62),

$$\mathrm{Var}\left(\frac{2\tilde{S}_{xx}(f)\cdot q}{S_{xx}(f)}\right) \approx 4q \tag{10.87}$$

Thus,

$$\frac{4q^2}{S_{xx}^2(f)}\,\mathrm{Var}\left(\tilde{S}_{xx}(f)\right) \approx 4q$$

i.e. it follows that

$$\frac{\mathrm{Var}\left(\tilde{S}_{xx}(f)\right)}{S_{xx}^2(f)} \approx \frac{1}{q} \tag{10.88}$$

$$\frac{\sigma\left(\tilde{S}_{xx}(f)\right)}{S_{xx}(f)} \approx \frac{1}{\sqrt{q}} \tag{10.89}$$


This can be expressed differently. For example, the resolution bandwidth of the rectangular data window is $B = 1/T_r = q/T$. Thus, Equation (10.88) can be written as

$$\frac{\mathrm{Var}\left(\tilde{S}_{xx}(f)\right)}{S_{xx}^2(f)} \approx \frac{1}{BT} \tag{10.90}$$

which is the same as Equation (10.80). Note that by segmenting the data, the resolution bandwidth becomes wider since $T_r < T$. We must be aware that the underlying assumption of the above results is that each segment of the data must be independent (see the comments in MATLAB Example 10.1). Clearly this is generally not the case, particularly if the segments in the segment averaging overlap. This is commented on in the following paragraphs.

To summarize the segment averaging method: 

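1. Resolution bandwidth: $B \approx \dfrac{1}{T_r} = \dfrac{q}{T}$

2. Degrees of freedom: $n = 2q = 2BT$

3. Variance ratio: $\dfrac{\mathrm{Var}\left(\tilde{S}_{xx}(f)\right)}{S_{xx}^2(f)} \approx \dfrac{1}{q} = \dfrac{1}{BT}$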
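The $1/\sqrt{q}$ behaviour of Equation (10.89) can be illustrated with a short simulation. This is a sketch under stated assumptions: unit-variance Gaussian white noise, non-overlapping segments with a rectangular data window, a single DFT bin, and hypothetical function names. Note that the segment length is held fixed here, so the total record length grows with `q`.

```python
import math
import random


def periodogram_bin(segment, k):
    """Raw periodogram of one segment at DFT bin k (unit sampling interval)."""
    n = len(segment)
    re = sum(v * math.cos(2.0 * math.pi * k * i / n) for i, v in enumerate(segment))
    im = sum(v * math.sin(2.0 * math.pi * k * i / n) for i, v in enumerate(segment))
    return (re * re + im * im) / n


def segment_average(x, q, k):
    """Average of q non-overlapping raw periodograms, Equation (10.85)."""
    seg_len = len(x) // q
    return sum(periodogram_bin(x[i * seg_len:(i + 1) * seg_len], k)
               for i in range(q)) / q


def normalized_random_error(q, seg_len=32, trials=500, seed=7):
    """Sample std of the averaged estimate divided by its sample mean."""
    rng = random.Random(seed)
    k = seg_len // 4
    estimates = []
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(q * seg_len)]
        estimates.append(segment_average(x, q, k))
    mean = sum(estimates) / trials
    var = sum((e - mean) ** 2 for e in estimates) / (trials - 1)
    return math.sqrt(var) / mean


for q in (1, 4, 16):
    print(q, round(normalized_random_error(q), 3), "vs", round(1.0 / math.sqrt(q), 3))
```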


While the above description summarizes the essential features of the method, Welch 
(1967) and Bingham et al. (1967) give more elaborate procedures and insight. Since the use 
of a rectangular data window introduces leakage, the basic method above is usually modified 
by using other data windows. This is often called linear tapering. The word ‘linear’ here does 
not refer to the window shape but to the fact that it operates on the data directly and not on the 
autocorrelation function (see Equation (10.73)) where it is sometimes called quadratic tapering. 

The use of a data window on a segment before transforming reduces leakage. However, since the windows have tapering ends, the values obtained for $\hat{S}_{xx_i}(f)$ must be compensated for the ‘power reduction’ introduced by the window. This results in the calculation of ‘modified’ periodograms for each segment of the form

$$\hat{S}_{xx_i}(f) = \frac{\dfrac{1}{T_r}\left|\displaystyle\int_{i\text{th interval}} x(t)\,w(t)\,e^{-j2\pi ft}\,dt\right|^2}{\dfrac{1}{T_r}\displaystyle\int_{-T_r/2}^{T_r/2} w^2(t)\,dt} \tag{10.91}$$

where the denominator compensates for the power reduction, and is unity for the rectangular window and 3/8 for the Hann window.
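The compensation factor in the denominator of Equation (10.91) is simply the mean square value of the data window. The sketch below evaluates it numerically with a midpoint rule; the function names are hypothetical and the Hann data window is assumed to be $w(t) = \frac{1}{2}\left(1 + \cos(2\pi t/T_r)\right)$ on $|t| \le T_r/2$.

```python
import math


def hann_data_window(t, tr):
    """Hann data window of total length tr, centred on t = 0."""
    if abs(t) > tr / 2.0:
        return 0.0
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * t / tr))


def power_compensation(window, tr, steps=10000):
    """(1/tr) * integral of w^2(t) over [-tr/2, tr/2], by the midpoint rule."""
    h = tr / steps
    total = sum(window(-tr / 2.0 + (i + 0.5) * h, tr) ** 2 for i in range(steps))
    return total * h / tr


print(power_compensation(hann_data_window, tr=2.0))    # close to 3/8
print(power_compensation(lambda t, tr: 1.0, tr=2.0))   # rectangular window
```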






Finally, we note that the use of a data window ignores some data because of the tapered 
shape of the window. Intuitively the overlapping of segments compensates for this in some 
way (Welch, 1967), though it is not easy to relate these results to the indirect method of Fourier 
transforming the autocorrelation function. We must remember that ideally segments should 
be independent to obtain the variance reduction - and with overlapping this is compromised. 
We simply quote the following results from Welch (1967) which is based on Gaussian white 
noise. If the segments overlap by one-half of their length (50 % overlap), and the total data 
length is N points and each individual segment length is L points, then 


1. The number of degrees of freedom is $n \approx 2\left(\dfrac{2N}{L} - 1\right)$

2. The variance ratio is 

3. The resolution is $1/(L\Delta) = f_s/L$, but note that this depends on the data window being used.


Frequency Smoothing 

This approach is based on the comments given in Method (2) (in the last paragraph). The method is as follows:

1. Form the raw periodogram from the data length $T$.

2. Average $l$ neighbouring estimates of this spectrum, i.e. form

$$\tilde{S}_{xx}(f_k) = \frac{1}{l}\sum_{i=1}^{l}\hat{S}_{xx}(f_i) \tag{10.92}$$

where the $f_i$ surround $f_k$.

As before, we can argue that $(2\tilde{S}_{xx}(f)\cdot l)/S_{xx}(f)$ is distributed as $\chi_{2l}^2$ and

$$\frac{\mathrm{Var}\left(\tilde{S}_{xx}(f)\right)}{S_{xx}^2(f)} \approx \frac{1}{l} \tag{10.93}$$

The resolution bandwidth before smoothing is approximately $1/T$, but after smoothing it is approximately $l/T$ since $l$ neighbouring values are averaged. This method is effectively the same as the indirect method (Method (2)).

Note that one might combine both segment averaging and frequency smoothing to get an estimate with $2lq$ degrees of freedom and then the resolution bandwidth is approximately $lq/T$.
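The two steps of frequency smoothing can be sketched in a few lines. This is an illustration only: the function names are hypothetical, an odd number of neighbours `l` is assumed so that the average is centred on each bin, and bins too close to the ends of the list to have a full neighbourhood are simply dropped.

```python
def frequency_smooth(raw, l):
    """Average l neighbouring raw periodogram values around each interior bin."""
    if l < 1 or l % 2 == 0:
        raise ValueError("l must be a positive odd integer")
    half = l // 2
    return [sum(raw[k - half:k + half + 1]) / l
            for k in range(half, len(raw) - half)]


def smoothed_bandwidth(l, t):
    """Approximate resolution bandwidth l/T after averaging l neighbouring bins."""
    return l / t


raw = [1.0, 2.0, 3.0, 4.0, 5.0]        # stand-in for raw periodogram values
print(frequency_smooth(raw, 3))
print(smoothed_bandwidth(5, 10.0))     # bandwidth in Hz for T = 10 s
```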


Confidence Intervals for Spectral Estimates 

We now discuss the ‘interval estimates’ based on the point estimates for the smoothed spectral density function $\tilde{S}_{xx}(f)$. We have seen that $n\tilde{S}_{xx}(f)/S_{xx}(f)$ is distributed as a $\chi_n^2$ random variable where the probability density function for $\chi_n^2$ is of the form shown in Figure 10.21, i.e. the values taken by $n\tilde{S}_{xx}(f)/S_{xx}(f)$ are much more likely to fall within the hump than the tail or near $x = 0$.

Figure 10.21 The creation of confidence intervals

If we choose a number $\alpha$ ($0 < \alpha < 1$) such that sections of area $\alpha/2$ are as shown marked off by the points $x_{n,\alpha/2}$ and $x_{n,1-\alpha/2}$, then the following probability statement can be made:

$$P\left[x_{n,\alpha/2} < \frac{n\tilde{S}_{xx}(f)}{S_{xx}(f)} \le x_{n,1-\alpha/2}\right] = 1 - \alpha \tag{10.94}$$

The points $x_{n,\alpha/2}$ and $x_{n,1-\alpha/2}$ can be obtained from tables of $\chi_n^2$ for different values of $\alpha$. Now the inequality can be solved for the true spectral density $S_{xx}(f)$ from the following equivalent inequality:

$$\frac{n\tilde{S}_{xx}(f)}{x_{n,1-\alpha/2}} \le S_{xx}(f) < \frac{n\tilde{S}_{xx}(f)}{x_{n,\alpha/2}} \tag{10.95}$$

Thus, for a particular sample value $\tilde{S}_{xx}(f)$ (a point estimate), the $100(1-\alpha)\,\%$ confidence limits for $S_{xx}(f)$ are

$$\frac{n}{x_{n,1-\alpha/2}}\tilde{S}_{xx}(f) \quad \text{and} \quad \frac{n}{x_{n,\alpha/2}}\tilde{S}_{xx}(f) \tag{10.96}$$

and the confidence interval is the difference between these two limits.

Note that on a linear scale the confidence interval depends on the estimate $\tilde{S}_{xx}(f)$, but on a log scale the confidence limits are

$$\log\left(\frac{n}{x_{n,1-\alpha/2}}\right) + \log\left(\tilde{S}_{xx}(f)\right) \quad \text{and} \quad \log\left(\frac{n}{x_{n,\alpha/2}}\right) + \log\left(\tilde{S}_{xx}(f)\right) \tag{10.97}$$

and so the interval is $\log(n/x_{n,\alpha/2}) - \log(n/x_{n,1-\alpha/2})$, which is independent of $\tilde{S}_{xx}(f)$. Thus, if the spectral estimate $\tilde{S}_{xx}(f)$ is plotted on a logarithmic scale, then the confidence interval for the spectrum can be represented by a constant interval about the estimate. Figure 10.22 indicates the behaviour of $n/x_{n,\alpha/2}$ and $n/x_{n,1-\alpha/2}$ (Jenkins and Watts, 1968).

From Figure 10.22, we can clearly see that the confidence interval decreases as the number of degrees of freedom $n$ increases. For example, if $n = 100$ (approximately 50 averages for the segment averaging method), the 95 % confidence interval is about $[0.77\tilde{S}_{xx}(f),\ 1.35\tilde{S}_{xx}(f)]$. Sometimes, the number of degrees of freedom is referred to as the ‘statistical degrees of freedom (Stat DOF)’, and more than 120 degrees of freedom is often required in many random vibration testing standards (e.g. MIL-STD-810F and IEC 60068-2-64).
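The multipliers $n/x_{n,1-\alpha/2}$ and $n/x_{n,\alpha/2}$ can be computed without tables. The sketch below is not part of the original method: it assumes the Wilson-Hilferty approximation for the chi-squared percentage points (accurate for moderate to large $n$), and the function names are hypothetical.

```python
from statistics import NormalDist


def chi2_quantile(p, n):
    """Wilson-Hilferty approximation to the p-quantile of chi-squared with n DOF."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * n)
    return n * (1.0 - c + z * c ** 0.5) ** 3


def confidence_factors(n, alpha=0.05):
    """Multipliers of the smoothed estimate giving the 100(1-alpha)% limits."""
    lower = n / chi2_quantile(1.0 - alpha / 2.0, n)
    upper = n / chi2_quantile(alpha / 2.0, n)
    return lower, upper


lower, upper = confidence_factors(100)
print(round(lower, 2), round(upper, 2))
```

For $n = 100$ the two multipliers can be compared with the interval quoted in the text above.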
Figure 10.22 Confidence interval limits

10.5 CROSS-SPECTRAL DENSITY FUNCTION^M10.1,10.3

The basic considerations given in Section 10.4 relate also to cross-spectral density function 
estimation together with some additional features, but we shall not go into any detail here. 
Detailed results can be found in Jenkins and Watts (1968). We shall merely summarize some 
important features. 

The raw cross-spectral density function can be obtained by Fourier transforming the raw cross-correlation function, i.e.

$$\hat{S}_{xy}(f) = \int_{-T}^{T}\hat{R}_{xy}^{b}(\tau)\,e^{-j2\pi f\tau}\,d\tau \tag{10.98}$$

and this has the same unsatisfactory properties as the raw power spectral density function. Thus, as before, a lag window $w(\tau)$ is introduced to smooth the estimate, i.e.

$$\tilde{S}_{xy}(f) = \int_{-T}^{T}\hat{R}_{xy}^{b}(\tau)\,w(\tau)\,e^{-j2\pi f\tau}\,d\tau \tag{10.99}$$
Note that the unbiased estimator $\hat{R}_{xy}(\tau)$ may also be used in place of $\hat{R}_{xy}^{b}(\tau)$ provided that the maximum lag $\tau_{\max}$ is relatively small compared with $T$. Alternatively, the smoothed estimate can be obtained by the segment averaging method or by frequency smoothing of the raw cross-spectral density. For example, if the segment averaging method is used the raw and smoothed cross-spectral density functions are

$$\hat{S}_{xy_i}(f) = \frac{1}{T_r}\left[X_{T_r i}^{*}(f)\,Y_{T_r i}(f)\right] \quad \text{for } i = 1, 2, \ldots, q \tag{10.100}$$

$$\tilde{S}_{xy}(f) = \frac{1}{q}\sum_{i=1}^{q}\hat{S}_{xy_i}(f) \tag{10.101}$$

The smoothed estimate $\tilde{S}_{xy}(f)$ may be written in the form

$$\tilde{S}_{xy}(f) = \left|\tilde{S}_{xy}(f)\right|e^{j\arg\tilde{S}_{xy}(f)} \tag{10.102}$$

Roughly speaking, one can show that the variances of the amplitude $|\tilde{S}_{xy}(f)|$ and the phase $\arg\tilde{S}_{xy}(f)$ are proportional to $1/BT$ where $B$ is the resolution bandwidth and $T$ is the data length.

Whilst the general effect of smoothing is much the same as for the power spectral density estimate, we note in addition that the amplitude and phase estimators are also strongly dependent on the ‘true’ coherence function $\gamma_{xy}^2(f)$. So, as Jenkins and Watts (1968) observed, the sampling properties of the amplitude and phase estimators may be dominated by the ‘uncontrollable’ influence of the coherence spectrum $\gamma_{xy}^2(f)$ rather than by the ‘controllable’ influence of the smoothing factor $1/BT$. For example, the variances of the modulus and phase of $\tilde{S}_{xy}(f)$ are shown to be (Bendat and Piersol, 2000)

$$\frac{\mathrm{Var}\left(|\tilde{S}_{xy}(f)|\right)}{|S_{xy}(f)|^2} \approx \frac{1}{\gamma_{xy}^2(f)}\cdot\frac{1}{BT} \tag{10.103}$$

$$\mathrm{Var}\left(\arg\tilde{S}_{xy}(f)\right) \approx \frac{1 - \gamma_{xy}^2(f)}{\gamma_{xy}^2(f)}\cdot\frac{1}{2BT} \tag{10.104}$$

Note the ‘uncontrollable’ influence of the true coherence function $\gamma_{xy}^2(f)$ on the variability of the estimate. Note also that the variance of $\arg\tilde{S}_{xy}(f)$ is not normalized. In particular, if $x(t)$ and $y(t)$ are fully linearly related, i.e. $\gamma_{xy}^2(f) = 1$, then $\mathrm{Var}(\arg\tilde{S}_{xy}(f)) \approx 0$. Thus, we see that the random error of the phase estimator is much smaller than that of the amplitude estimator.

Similar to the power spectral density estimate, in general, the estimator $\tilde{S}_{xy}(f)$ is approximately unbiased when $T$ is sufficiently large and the resolution bandwidth is narrow. However, there is another important aspect: since $R_{xy}(\tau)$ is not (in general) symmetric, it is necessary to ensure that its maximum is well within the window $w(\tau)$ or serious bias errors result. For example, if $y(t) = x(t - \Delta)$, then it can be shown (Schmidt, 1985b) that, for rectangular windows,

$$E\left[\tilde{S}_{xy}(f)\right] \approx \left(1 - \frac{\Delta}{T_r}\right)S_{xy}(f) \tag{10.105}$$

where $T_r$ is the length of window (or the length of segment). Note that the time delay between signals results in biased estimates (see MATLAB Example 10.3). This problem may be avoided by ‘aligning’ the two time series so that the cross-correlation function $R_{xy}(\tau)$ has a maximum at $\tau = 0$.


10.6 COHERENCE FUNCTION^M10.2


The estimate for the coherence function is made up from the estimates of the smoothed power and cross-spectral density functions as

$$\tilde{\gamma}_{xy}^2(f) = \frac{\left|\tilde{S}_{xy}(f)\right|^2}{\tilde{S}_{xx}(f)\,\tilde{S}_{yy}(f)} \tag{10.106}$$

It should be noted that if ‘raw’ spectral density functions are used on the right hand side of the equation, it can be easily verified that the sample coherence function $\hat{\gamma}_{xy}^2(f)$ is always ‘unity’ for all frequencies for any signals $x$ and $y$ (even if they are unrelated).
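The remark about raw spectra is easy to verify numerically. The sketch below is illustrative, with hypothetical function names: it forms Equation (10.106) by segment averaging at a single DFT bin for two independent white noise sequences. With one segment (raw spectra) the ratio is identically one, because $|X^*Y|^2 = |X|^2|Y|^2$; with many segments it falls towards the true value of zero.

```python
import cmath
import math
import random


def dft_bin(x, k):
    """Single DFT coefficient X(k) of the sequence x."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))


def coherence_estimate(x, y, q, k):
    """Coherence at bin k from q non-overlapping segments (scale factors cancel)."""
    seg = len(x) // q
    sxx = 0.0
    syy = 0.0
    sxy = 0j
    for i in range(q):
        xs = dft_bin(x[i * seg:(i + 1) * seg], k)
        ys = dft_bin(y[i * seg:(i + 1) * seg], k)
        sxx += abs(xs) ** 2
        syy += abs(ys) ** 2
        sxy += xs.conjugate() * ys
    return abs(sxy) ** 2 / (sxx * syy)


rng = random.Random(11)
x = [rng.gauss(0.0, 1.0) for _ in range(512)]
y = [rng.gauss(0.0, 1.0) for _ in range(512)]   # unrelated to x
print(coherence_estimate(x, y, q=1, k=100))     # raw spectra
print(coherence_estimate(x, y, q=32, k=4))      # 32 averages
```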

Detailed calculations are given in Jenkins and Watts (1968) for the sampling properties of the smoothed coherence function $\tilde{\gamma}_{xy}^2(f)$, but roughly speaking, the variance of $\tilde{\gamma}_{xy}^2(f)$ is proportional to $1/BT$ (which is known, so a controllable parameter) and also depends on $\gamma_{xy}^2(f)$ (which is unknown, so is an uncontrollable parameter), where the variance of $\tilde{\gamma}_{xy}^2(f)$ is shown to be

$$\frac{\mathrm{Var}\left(\tilde{\gamma}_{xy}^2(f)\right)}{\left(\gamma_{xy}^2(f)\right)^2} \approx \frac{2\left(1 - \gamma_{xy}^2(f)\right)^2}{\gamma_{xy}^2(f)}\cdot\frac{1}{BT} \tag{10.107}$$

This expression is sometimes used as an approximate guide after measurements have been made by replacing $\gamma_{xy}^2(f)$ with $\tilde{\gamma}_{xy}^2(f)$.

Jenkins and Watts (1968) show that the bias of this estimator is proportional to the square 
of the derivative of the phase spectrum arg S xy (f). For example, if the Hann window is used 
the normalized bias error can be expressed by 



(10.108) 


for large T (total data length), where T w is half of the lag window length as defined in 
Table 10.1. For the Parzen window, 0.126 is replaced by 0.304. The above equation means 
that the estimator is sensitive to delays between x(t) and y(t). Similar to the cross-spectral 
density function estimate, such bias can be reduced by realigning the processes, i.e. aligning 
the peak in the cross-correlation between x(t) and y(t) to occur at zero lag. 
Note also that, if $x(t)$ and $y(t)$ are the input and output of a lightly damped oscillator, severe bias errors are likely to occur at resonant (and anti-resonant) frequencies where the phase changes rapidly. Since the resolution bandwidth is inversely proportional to the length of the lag window (e.g. for the Hann window $B = 1.333/T_w$ as shown in Table 10.2), the bias error can be reduced by improving the resolution of the estimate, i.e. the resolution bandwidth $B$ should be reduced (see MATLAB Example 10.2). In Figure 10.23, the coherence function is estimated for simulated input/output results for a lightly damped oscillator ($\zeta = 0.02$) with natural frequency at 1 Hz. The theoretical value of coherence is unity and the resolutions used are shown in the figure, where it can be seen that the bias is reduced as the resolution bandwidth $B$ decreases.

If the resolution bandwidth is chosen adequately, the bias of $\tilde{\gamma}_{xy}^2(f)$ may be approximated (Carter et al., 1973) by

$$b\left(\tilde{\gamma}_{xy}^2(f)\right) \approx \frac{1}{BT}\left(1 - \gamma_{xy}^2(f)\right)^2 \tag{10.109}$$

This shows that the estimate $\tilde{\gamma}_{xy}^2(f)$ is asymptotically unbiased (i.e. for large $BT$).



Figure 10.23 Bias in the coherence function estimate 


10.7 FREQUENCY RESPONSE FUNCTION 

The frequency response function is estimated using smoothed spectral density functions. For example, the estimate of $H_1(f)$ can be obtained from

$$\hat{H}_1(f) = \frac{\tilde{S}_{xy}(f)}{\tilde{S}_{xx}(f)} \tag{10.110}$$

Note that we use the notation $\hat{H}_1(f)$ to distinguish it from the theoretical quantity $H_1(f)$, though it is not explicitly used in Chapter 9. The results for errors and confidence limits can be found in Bendat and Piersol (1980, 2000) and Otnes and Enochson (1978). A few results from Bendat and Piersol are quoted below.

Bias errors in the frequency response estimate $\hat{H}_1(f)$ arise from:

(a) bias in the estimation procedure;

(b) nonlinear effects;

(c) bias in power spectral and cross-spectral density function estimators;

(d) measurement noise on input (note that uncorrelated output noise does not cause bias).


In connection with (a), we would get bias effects since

$$E\left[\hat{H}_1(f)\right] = E\left[\frac{\tilde{S}_{xy}(f)}{\tilde{S}_{xx}(f)}\right] \ne \frac{E\left[\tilde{S}_{xy}(f)\right]}{E\left[\tilde{S}_{xx}(f)\right]}$$

i.e. $E[\hat{H}_1(f)] \ne H_1(f)$. However, this effect is usually small if $BT$ is large. In connection with (b), use of Equation (10.110) produces the best linear approximation (in the least squares sense) for the frequency response function. In connection with (c), bias in the power spectral and cross-spectral density functions may be significant at peaks and troughs. These are suppressed by having narrow resolution bandwidth. In connection with (d), we have already discussed this in Chapter 9 (i.e. various FRF estimators $H_2(f)$ and $H_W(f)$ (or $H_T(f)$) are discussed to cope with the measurement noise).

Finally, the variances of the modulus and phase of $\hat{H}_1(f)$ are

$$\frac{\mathrm{Var}\left(|\hat{H}_1(f)|\right)}{|H_1(f)|^2} \approx \frac{1 - \gamma_{xy}^2(f)}{\gamma_{xy}^2(f)}\cdot\frac{1}{2BT} \tag{10.111}$$

$$\mathrm{Var}\left(\arg\hat{H}_1(f)\right) \approx \frac{1 - \gamma_{xy}^2(f)}{\gamma_{xy}^2(f)}\cdot\frac{1}{2BT} \tag{10.112}$$

This shows that, similar to the estimates $\tilde{S}_{xy}(f)$ and $\tilde{\gamma}_{xy}^2(f)$, the variances depend on both the controllable parameter $BT$ and the uncontrollable parameter $\gamma_{xy}^2(f)$. Note that the right hand sides of Equations (10.111) and (10.112) are the same. Also, comparing with the results of the cross-spectral density estimate $\tilde{S}_{xy}(f)$ shown in Equations (10.103) and (10.104), we see that the normalized variance of $|\hat{H}_1(f)|$ is smaller than that of $|\tilde{S}_{xy}(f)|$, while the variances of the phase estimators are the same, i.e. $\mathrm{Var}(\arg\hat{H}_1(f)) = \mathrm{Var}(\arg\tilde{S}_{xy}(f))$. In practice, this implies that we may need a shorter data length (or fewer averages) for the FRF estimate than for the cross-spectral density estimate. Note that, if $\gamma_{xy}^2(f) = 1$, then both $\mathrm{Var}(|\hat{H}_1(f)|)$ and $\mathrm{Var}(\arg\hat{H}_1(f))$ approach zero.


The random errors of $\hat{H}_2(f)$ may be similar to those of $\hat{H}_1(f)$ since the $H_2(f)$ estimator can be thought of as reversing the role of input and output defined for $H_1(f)$ in the optimization scheme (discussed in Chapter 9). The random errors of $\hat{H}_W(f)$ (or $\hat{H}_T(f)$) are not as obvious as the others. However, if there is no measurement noise it can easily be seen that all three theoretical quantities are the same, i.e. $H_1(f) = H_2(f) = H_W(f)$. Thus, apart from the error due to the measurement noise, we may anticipate that the random errors are similar to those of the $\hat{H}_1(f)$ estimator. Details of the statistical properties of $\hat{H}_W(f)$ can be found in White et al. (2006).

We summarize the normalized random errors of various estimates in Table 10.3, where the factor $BT$ can be replaced by the number of averages $q$ for the segment averaging method (assuming that the data segments used are mutually uncorrelated).


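Table 10.3 Random errors for some smoothed estimators

| Estimator | Random error, $\varepsilon_r = \sigma(\tilde{\Phi})/\Phi$ |
|---|---|
| $\tilde{S}_{xx}(f)$ | $\varepsilon_r \approx \dfrac{1}{\sqrt{BT}}$ |
| $\lvert\tilde{S}_{xy}(f)\rvert$ | $\varepsilon_r \approx \dfrac{1}{\lvert\gamma_{xy}(f)\rvert\sqrt{BT}}$ |
| $\arg\tilde{S}_{xy}(f)$ | $\sigma\left(\arg\tilde{S}_{xy}(f)\right) \approx \dfrac{\left[1 - \gamma_{xy}^2(f)\right]^{1/2}}{\lvert\gamma_{xy}(f)\rvert\sqrt{2BT}}$ |
| $\lvert\hat{H}_1(f)\rvert$ | $\varepsilon_r \approx \dfrac{\left[1 - \gamma_{xy}^2(f)\right]^{1/2}}{\lvert\gamma_{xy}(f)\rvert\sqrt{2BT}}$ |
| $\arg\hat{H}_1(f)$ | $\sigma\left(\arg\hat{H}_1(f)\right) \approx \dfrac{\left[1 - \gamma_{xy}^2(f)\right]^{1/2}}{\lvert\gamma_{xy}(f)\rvert\sqrt{2BT}}$ |
| $\tilde{\gamma}_{xy}^2(f)$ | $\varepsilon_r \approx \dfrac{\sqrt{2}\left[1 - \gamma_{xy}^2(f)\right]}{\lvert\gamma_{xy}(f)\rvert\sqrt{BT}}$ |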
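The formulas collected in Table 10.3 can be evaluated directly when planning a measurement. The following is a small sketch with a hypothetical function name; `gamma2` stands for the true coherence $\gamma_{xy}^2(f)$ (in practice unknown, so an estimate or a guess has to be substituted) and `bt` for the product $BT$ or the number of independent averages $q$.

```python
import math


def random_errors(gamma2, bt):
    """Approximate random errors of smoothed estimators for coherence gamma2 and BT."""
    if not 0.0 < gamma2 <= 1.0:
        raise ValueError("gamma2 must lie in (0, 1]")
    if bt <= 0:
        raise ValueError("bt must be positive")
    g = math.sqrt(gamma2)
    phase_like = math.sqrt(1.0 - gamma2) / (g * math.sqrt(2.0 * bt))
    return {
        "Sxx": 1.0 / math.sqrt(bt),
        "|Sxy|": 1.0 / (g * math.sqrt(bt)),
        "arg Sxy": phase_like,   # standard deviation in radians, not normalized
        "|H1|": phase_like,
        "arg H1": phase_like,    # standard deviation in radians, not normalized
        "coherence": math.sqrt(2.0) * (1.0 - gamma2) / (g * math.sqrt(bt)),
    }


# Illustrative values only: coherence 0.5 and 50 averages.
for name, value in random_errors(gamma2=0.5, bt=50.0).items():
    print(name, round(value, 3))
```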


10.8 BRIEF SUMMARY 

1 . Estimator errors are defined by 

Bias: b(&) = £[4>] - 0 
Variance: Var(<t>) = £[<i> 2 ] - £ 2 [$] 

Mean square error: mse(4>) = Var(<J>) + b 2 (i>) 
The normalized errors are 

Bias error: St = b(4>)/cl> 

Random error: e r = cr (<J>)/ 0 

RMS error: e = J mse(<i>)/0 
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2. R xx (r) and R xy (t) are the unbiased autocorrelation and cross-correlation function 
estimates; however, the biased (but asymptotically unbiased) estimators R xx ( t) and 
R xy ( r) have a smaller mean square error. When unbiased estimators are used, the ratio 
of the maximum lag to the total data length, r max / T, should not exceed 0.1. 
Correlation functions may be estimated with arbitrarily small error if the length of the 
data is sufficiently long. 

3. The ‘raw’ power spectral density function S xx (f ) is an asymptotically unbiased esti- 
mator; however, the variance of S xx (f) is Var ( S xx (f )) = S xx (f). 

4. The ‘smoothed’ power spectral density function S xx (f) can be obtained by 

OO 

&,(/)= f R b xx (T)w(T)e- J2nfz dT 

— OO 

or 

S xx (f) =-J2 where -W/) = ^ |Xr„(/)| 2 

9 j=1 'r 

5. The bias error of S xx (f ) is usually small if S xx (f ) is smooth. However, the estimator 
S xx (f ) usually underestimates the peaks and overestimates the troughs (i.e. dynamic 
range is reduced). The bias error can be reduced by improving the resolution band- 
width. The resolution bandwidths and approximate bias errors for various lag windows 
are shown in Table 10.2. 

6. The variance of S xx (f) is given by 

Var (S„(/)) 1 

s 2 xx (f) ~ bt 

where the BT can be replaced by the number of averages q for the segment averaging 
method (the number of degree is n =2 BT). The random error is reduced as the product 
BT becomes large. However, in general, we need to trade-off between the resolution 
(bias error) and the variability (random error). 

While maintaining the good resolution (low bias) the only way to reduce the random 
error is by increasing the data length T . 

7. The cross-spectral density function estimate $\tilde{S}_{xy}(f)$ has statistical properties similar
to those of $\tilde{S}_{xx}(f)$. However, this estimator depends on the ‘true’ coherence function
$\gamma^2_{xy}(f)$, which is an ‘uncontrollable’ parameter.

A time delay between two signals $x(t)$ and $y(t)$ can introduce a severe bias error.

8. The statistical properties of the coherence function estimate $\tilde{\gamma}^2_{xy}(f)$ depend on both the
true coherence function $\gamma^2_{xy}(f)$ and the product $BT$. The random error is reduced by
increasing the product $BT$. However, significant bias error may occur if the resolution
bandwidth is wide when $\arg S_{xy}(f)$ changes rapidly.

9. The estimator $H_1(f)$ also depends on both the controllable parameter $BT$ and the
uncontrollable parameter $\gamma^2_{xy}(f)$.

The random errors of various estimators are summarized in Table 10.3. 
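
As a rough numerical illustration of items 5 and 6, the following sketch tabulates the resolution bandwidth and an approximate normalized random error for a Hann window with segment averaging. The relations $B \approx 1.33/T_w$ with $T_w \approx T_r/2$, and the replacement of $BT$ by the number of averages $q$, are taken from this chapter. The data lengths, the variable names, and the count $q \approx 2T/T_r$ for 50 % overlap are assumptions made for illustration only. Overlapped segments are not independent, so the random error printed here is somewhat optimistic.

```matlab
% Sketch: resolution bandwidth and approximate random error (assumed values).
T  = [2000 2000 10000];   % total data lengths in seconds (assumed)
Tr = [4 20 20];           % segment lengths in seconds (assumed)
Tw = Tr/2;                % effective Hann window length, Tw ~ Tr/2
B  = 1.33./Tw;            % resolution bandwidth in Hz (Hann window)
q  = 2*T./Tr;             % approximate number of averages with 50 % overlap
er = 1./sqrt(q);          % approximate normalized random error, 1/sqrt(q)
for k = 1:numel(T)
    fprintf('T = %5d s, Tr = %2d s: B = %.3f Hz, q = %4d, er = %.3f\n', ...
            T(k), Tr(k), B(k), q(k), er(k));
end
```

A longer segment narrows $B$ (less bias) but, for fixed $T$, reduces $q$ (more random error); only a longer record improves both.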
10.9 MATLAB EXAMPLES


Example 10.1: Statistical errors of power and cross-spectral density functions 

Consider the same example (2-DOF system) as in MATLAB Example 9.4, i.e.

$$h(t) = \frac{A_1}{\omega_{d1}} e^{-\zeta_1\omega_{n1}t}\sin\omega_{d1}t + \frac{A_2}{\omega_{d2}} e^{-\zeta_2\omega_{n2}t}\sin\omega_{d2}t$$

Again, we use white noise as an input $x(t)$, and the output $y(t)$ is obtained by
$y(t) = h(t) * x(t)$. However, we do not consider measurement noise.

Since we use the white noise input (band-limited up to $f_s/2$, where $f_s$ is the
sampling rate, i.e. $\sigma_x^2 = \int_{-f_s/2}^{f_s/2} S_{xx}(f)\,df = 1$), the theoretical spectral density functions are
$S_{xx}(f) = \sigma_x^2/f_s$, $S_{yy}(f) = |H(f)|^2\sigma_x^2/f_s$ and $S_{xy}(f) = H(f)\sigma_x^2/f_s$. These theoretical
values are compared with the estimated spectral density functions.

The segment averaging method is used to obtain smoothed spectral density functions
$\tilde{S}_{xx}(f)$, $\tilde{S}_{yy}(f)$ and $\tilde{S}_{xy}(f)$. Then, for a given data length $T$, we demonstrate how the
bias error and the random error change depending on the resolution bandwidth $B \approx 1/T_r$,
where $T_r$ is the length of the segment.


Line  MATLAB code and comments

      % Lines 1-6: same as MATLAB Example 9.4, except that the damping ratios
      % are smaller, i.e. we use a more lightly damped system.
1     clear all
2     A1=20; A2=30; f1=5; f2=15; wn1=2*pi*f1; wn2=2*pi*f2;
3     zeta1=0.02; zeta2=0.01;
4     wd1=sqrt(1-zeta1^2)*wn1; wd2=sqrt(1-zeta2^2)*wn2;
5     fs=100; T1=10; t1=[0:1/fs:T1-1/fs];
6     h=(A1/wd1)*exp(-zeta1*wn1*t1).*sin(wd1*t1) + (A2/wd2)*exp(-zeta2*wn2*t1).*sin(wd2*t1);

      % Line 7: define the data length T seconds. First, use T = 2000, then compare
      % the results with the case T = 10000 (when Tr = 20 is used at Line 11).
7     T=2000; % T=10000;

      % Lines 8-10: generate the white noise input sequence 'x' (variance 1), and
      % then obtain the output sequence 'y' (scaled appropriately).
8     randn('state',0);
9     x=randn(1,T*fs);
10    y=filter(h,1,x)/fs; % scaled appropriately.

      % Line 11: define the length of segment Tr seconds. First, use Tr = 4
      % (approximately 1000 averages), then compare the results with the case
      % Tr = 20 (approximately 200 averages).
11    Tr=4; N=Tr*fs; % Tr=20;

      % Lines 12-16: obtain the spectral density estimates using the segment
      % averaging method (Hann window with 50 % overlap is used). Also, calculate
      % H(f) by the DFT of the impulse response sequence (scaled appropriately).
12    [Sxx, f]=cpsd(x,x, hanning(N),N/2, N, fs, 'twosided');
13    [Syy, f]=cpsd(y,y, hanning(N),N/2, N, fs, 'twosided');
14    [Sxy, f]=cpsd(x,y, hanning(N),N/2, N, fs, 'twosided');
15    H=fft(h)/fs; % scaled appropriately.
16    f1=fs*(0:length(H)-1)/length(H);

      % Lines 17-21: plot the power spectral density estimate of x, multiplied by
      % 'fs'. Also, plot the variance of the signal, sigma_x^2 = Sxx(f)*fs = 1
      % (0 dB), for comparison.
17    figure(1)
18    plot(f,10*log10(fs*Sxx), f, zeros(size(f)), 'r:')
19    xlabel('Frequency (Hz)')
20    ylabel('Estimate of \itS_x_x\rm(\itf\rm) (dB)')
21    axis([0 30 -10 10])

      % Lines 22-26: plot the power spectral density estimate of y and the
      % theoretical power spectral density function Syy(f) = |H(f)|^2*sigma_x^2/fs.
22    figure(2)
23    plot(f,10*log10(Syy), f1,10*log10(abs(H).^2/fs), 'r:')
24    xlabel('Frequency (Hz)')
25    ylabel('Estimate of \itS_y_y\rm(\itf\rm) (dB)')
26    axis([0 30 -100 -20])

      % Lines 27-31: plot the magnitude spectra of the cross-spectral density
      % estimate and of the theoretical Sxy(f) = H(f)*sigma_x^2/fs.
27    figure(3)
28    plot(f,10*log10(abs(Sxy)), f1,10*log10(abs(H)/fs), 'r:')
29    xlabel('Frequency (Hz)')
30    ylabel('Estimate of |\itS_x_y\rm(\itf\rm)| (dB)')
31    axis([0 30 -60 -20])

      % Lines 32-36: plot the phase spectra of the cross-spectral density
      % estimate and of the theoretical Sxy(f).
32    figure(4)
33    plot(f,unwrap(angle(Sxy)), f1,unwrap(angle(H)), 'r:')
34    xlabel('Frequency (Hz)')
35    ylabel('Estimate of arg\itS_x_y\rm(\itf\rm) (rad)')
36    axis([0 30 -3.5 0])


Results: Case (a) $T_r = 4$ seconds at Line 11 and $T = 2000$ at Line 7 (1000 averages)

[Figures (a1) and (a2): power spectral density function estimates $\tilde{S}_{xx}(f)$ and $\tilde{S}_{yy}(f)$]
Comments: Since the Hann window is used, the resolution bandwidth is $B \approx 1.33/T_w \approx 0.67$ Hz,
where $T_w \approx T_r/2$. Note that both $\tilde{S}_{yy}(f)$ and $|\tilde{S}_{xy}(f)|$ underestimate the peaks
and overestimate the trough owing to the bias error.

Results: Case (b) $T_r = 20$ seconds at Line 11 and $T = 2000$ at Line 7 (200 averages)

Comments: In this case, the resolution bandwidth is $B \approx 0.13$ Hz. It can be shown that
the bias errors of the spectral density estimates $\tilde{S}_{yy}(f)$ and $\tilde{S}_{xy}(f)$ are greatly reduced owing
to the improvement of the resolution. However, the random error is increased since the
number of averages is decreased. Note that $\arg \tilde{S}_{xy}(f)$ has much less random error than
$|\tilde{S}_{xy}(f)|$ (almost no random error is present in this example since $\gamma^2_{xy}(f) = 1$).

Results: Case (c) $T_r = 20$ seconds at Line 11 and $T = 10\,000$ at Line 7 (1000 averages)

Comments: While maintaining the narrow resolution bandwidth, increasing the number
of averages results in better estimates.

Comments on the segment averaging method: As mentioned in Section 10.4, the
underlying assumption for the segment averaging method is that the segments of the data
must be mutually uncorrelated. If they are correlated, the random error will not reduce appreciably. To
demonstrate this, use T = 2000 and Tr = 20 (i.e. Case (b)), and add the following script
between Line 9 and Line 10. Then run this MATLAB program again and compare the
result with Case (b).

x=[x 2*x 3*x 4*x 5*x]; x=x-mean(x); x=x/std(x);

Now, the total data length is 5 × 2000 = 10 000 seconds, so the number of averages
is approximately 1000, which is the same as in Case (c). However, the random error will
not reduce since correlated data are repeatedly used. For example, the results of $\tilde{S}_{xx}(f)$
and $|\tilde{S}_{xy}(f)|$ are shown in Figures (d).
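
The effect of reusing correlated data can also be checked without the Signal Processing Toolbox. The sketch below is not the program of this example: it assumes a plain averaged periodogram (rectangular window, no overlap), a segment length of 256 points and 200 independent segments, all chosen for illustration. It compares the relative scatter of the averaged spectrum across frequency bins for 200 independent segments, for the same data repeated five times, and for 1000 independent segments.

```matlab
% Sketch: repeating the same data does not reduce the random error.
% A plain averaged periodogram stands in for the Hann/50 % overlap processing.
N  = 256;                         % points per segment (assumed)
q  = 200;                         % number of independent segments (assumed)
x  = randn(1, N*q);               % 200 independent segments of white noise
xr = [x 2*x 3*x 4*x 5*x];         % the same data used repeatedly
xr = xr - mean(xr);  xr = xr/std(xr);
xi = randn(1, 5*N*q);             % 1000 independent segments

% Averaged periodogram (one segment per column) and its relative scatter
% over bins 2..N/2, i.e. excluding the DC bin.
avgper = @(v) mean(abs(fft(reshape(v, N, []))).^2, 2)/N;
relstd = @(P) std(P(2:N/2))/mean(P(2:N/2));

e200 = relstd(avgper(x));         % roughly 1/sqrt(200)
erep = relstd(avgper(xr));        % same scatter as e200 despite 1000 segments
eind = relstd(avgper(xi));        % roughly 1/sqrt(1000)
fprintf('200 averages: %.4f, repeated data: %.4f, 1000 independent: %.4f\n', ...
        e200, erep, eind);
```

Because the repeated segments are scaled copies of the original ones, the averaged spectrum is only rescaled and its scatter is unchanged, whereas genuinely new data reduce it.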
Example 10.2: Bias error of the coherence function estimate


In the previous MATLAB example, we did not consider the coherence function estimator
$\tilde{\gamma}^2_{xy}(f)$. We shall examine the bias error of $\tilde{\gamma}^2_{xy}(f)$ using the same system as in the previous
example. Note that the half-power point bandwidths at resonances ($f_1 = 5$ and $f_2 = 15$)
are $B_{r1} = 2\zeta_1 f_1 = 0.2$ Hz and $B_{r2} = 2\zeta_2 f_2 = 0.3$ Hz.

As mentioned in Section 10.6, considerable bias error may occur at resonances and
anti-resonances where the phase of the cross-spectral density function changes rapidly,
e.g. if the Hann window is used the normalized bias error is (i.e. Equation (10.108))


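$$\frac{b\left(\tilde{\gamma}^2_{xy}(f)\right)}{\gamma^2_{xy}(f)} \approx -\frac{0.126}{T_w^2}\left(\frac{d}{df}\arg S_{xy}(f)\right)^2$$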


In this example, various resolution bandwidths are used: $B_1 = 1$ Hz, $B_2 = 0.5$ Hz, $B_3 = 0.2$ Hz
and $B_4 = 0.05$ Hz. For each resolution bandwidth, approximately 1000 averages
are used so that the random error is negligible. A Hann window with 50 % overlap is used.
The length of each segment for the Hann window is obtained by $T_r = 2T_w \approx 2 \times 1.33/B$,
where $B$ is the resolution bandwidth (see Table 10.2).
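
To see how the chosen resolution bandwidths compare with the resonance bandwidths, the short sketch below evaluates the half-power bandwidths $B_r = 2\zeta f_n$ and the segment lengths $T_r \approx 2 \times 1.33/B$ quoted above. The variable names are chosen here for illustration and are not those of the listing that follows.

```matlab
% Sketch: compare the resolution bandwidths with the resonance bandwidths.
fn   = [5 15];                 % natural frequencies in Hz (from Example 10.1)
zeta = [0.02 0.01];            % damping ratios (from Example 10.1)
Br   = 2*zeta.*fn;             % half-power bandwidths: 0.2 Hz and 0.3 Hz
B    = [1 0.5 0.2 0.05];       % resolution bandwidths in Hz
Tr   = 2*1.33./B;              % Hann segment lengths in seconds
for k = 1:numel(B)
    fprintf('B = %.2f Hz: Tr = %5.2f s, B/Br1 = %.2f, B/Br2 = %.2f\n', ...
            B(k), Tr(k), B(k)/Br(1), B(k)/Br(2));
end
```

Only the narrowest bandwidth is well below both half-power bandwidths, which is where one would expect the smallest bias at the resonances.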


Line  MATLAB code and comments

      % Lines 1-6: same as MATLAB Example 10.1.
1     clear all
2     A1=20; A2=30; f1=5; f2=15; wn1=2*pi*f1; wn2=2*pi*f2;
3     zeta1=0.02; zeta2=0.01;
4     wd1=sqrt(1-zeta1^2)*wn1; wd2=sqrt(1-zeta2^2)*wn2;
5     fs=100; T1=10; t1=[0:1/fs:T1-1/fs];
6     h=(A1/wd1)*exp(-zeta1*wn1*t1).*sin(wd1*t1) + (A2/wd2)*exp(-zeta2*wn2*t1).*sin(wd2*t1);

      % Lines 7-10: define the resolution bandwidths B1, B2, B3 and B4. Then,
      % calculate the number of points of a segment for each bandwidth. Define
      % the total number of segments Ns = 500 that results in approximately
      % 1000 averages if 50 % overlap is used.
7     B1=1; B2=0.5; B3=0.2; B4=0.05;
8     N1=fix(1.33*2/B1*fs); N2=fix(1.33*2/B2*fs);
9     N3=fix(1.33*2/B3*fs); N4=fix(1.33*2/B4*fs);
10    Ns=500; Nt=N4*Ns;

      % Lines 11-13: generate white noise input sequence 'x' and the output
      % sequence 'y'.
11    randn('state',0);
12    x=randn(1,Nt);
13    y=filter(h,1,x); % we do not scale for convenience.

      % Lines 14-18: calculate the coherence function estimates for each
      % resolution bandwidth using the MATLAB function 'mscohere'. Also,
      % calculate H(f) by the DFT of the impulse response sequence. We calculate
      % this to compare the coherence estimate and arg H(f). Note that
      % arg H(f) = arg Sxy(f).
14    [Gamma_1, f]=mscohere(x(1:Ns*N1), y(1:Ns*N1), hanning(N1), [], N4, fs);
15    [Gamma_2, f]=mscohere(x(1:Ns*N2), y(1:Ns*N2), hanning(N2), [], N4, fs);
16    [Gamma_3, f]=mscohere(x(1:Ns*N3), y(1:Ns*N3), hanning(N3), [], N4, fs);
17    [Gamma_4, f]=mscohere(x(1:Ns*N4), y(1:Ns*N4), hanning(N4), [], N4, fs);
18    H=fft(h, N4); % we do not scale for convenience.

      % Lines 19-23: plot the coherence function estimates for each resolution
      % bandwidth.
19    figure(1)
20    plot(f, [Gamma_1 Gamma_2 Gamma_3 Gamma_4])
21    xlabel('Frequency (Hz)')
22    ylabel('Estimate of \it\gamma_x_y\rm^2(\itf\rm)')
23    axis([0 30 0 1])

      % Lines 24-28: plot arg H(f), which is the same as arg Sxy(f).
24    figure(2)
25    plot(f,unwrap(angle(H(1:length(f)))))
26    xlabel('Frequency (Hz)')
27    ylabel('arg\itH\rm(\itf\rm) = arg\itS_x_y\rm(\itf\rm) (rad)')
28    axis([0 30 -3.5 0])



Results

[Figures: (a) Coherence function estimate $\tilde{\gamma}^2_{xy}(f)$; (b) Phase spectrum of $H(f)$ or $S_{xy}(f)$]


Comments: Note the large bias error at the resonances and anti-resonance. Also note 
that the bias error decreases as the resolution bandwidth gets narrower. 
Another point on the bias error in the coherence function is that it depends on the
‘window function’ used in the estimation (Schmidt, 1985a). For example, if we use a
rectangular window, i.e. replace ‘hanning’ in Lines 14-17 with ‘rectwin’, then we may
not see the drop of coherence at resonances as shown in Figure (c). Readers may care to
try different window functions.

[Figure (c): Coherence function estimate $\tilde{\gamma}^2_{xy}(f)$ (rectangular window function is used)]


Example 10.3: Bias error of the cross-spectral density function estimate (time delay 
problem) 

In Section 10.5, we mentioned that the cross-spectral density function estimate $\tilde{S}_{xy}(f)$
produces a biased result if time delay is present between two signals. For example, if
$y(t) = x(t - \Delta)$, then the average of $\tilde{S}_{xy}(f)$ is (i.e. Equation (10.105) for a rectangular
window)

$$E\left[\tilde{S}_{xy}(f)\right] \approx \left(1 - \frac{\Delta}{T_r}\right) S_{xy}(f)$$

In this example, we use the white noise signal for $x(t)$ (band-limited up to $f_s/2$), and
$y(t) = x(t - \Delta)$ where $\Delta = 1$ second. Since it is a pure delay problem, the cross-spectral
density function is

$$S_{xy}(f) = e^{-j2\pi f\Delta} S_{xx}(f)$$

i.e. $|S_{xy}(f)| = S_{xx}(f) = \sigma_x^2/f_s$ (see MATLAB Example 10.1).

We shall examine the bias error of $\tilde{S}_{xy}(f)$ for various values of $T_r$. Note that the
bias error can only be reduced by increasing the window length (in effect, improving the
resolution) or by aligning two signals (Jenkins and Watts, 1968), e.g. $y(t)$ may be replaced
by $y'(t) = y(t + \Delta)$ if $\Delta$ can be found from the cross-correlation function ($\arg S_{xy}(f)$
must be compensated later).
Line  MATLAB code and comments

      % Lines 1-4: define the delay Delta = 1 second and the window length Tr.
      % We compare the results of four different window lengths Tr = 1.1, 2, 5
      % and 50 seconds. 'N' is the number of points in the segment and 'Nt' is
      % the total data length.
1     clear all
2     delta=1; fs=20;
3     Tr=1.1; % Tr=1.1, 2, 5, 50;
4     N=Tr*fs; Nt=1000*N;

      % Lines 5-9: generate white noise sequence 'x' and the delayed sequence 'y'
      % (note that sigma_x^2 = sigma_y^2 = 1). Then, calculate the cross-spectral
      % density function estimate. In this example, the rectangular window with
      % no overlap is used. So, the number of averages is 1000.
5     randn('state',0);
6     x=randn(1,Nt+delta*fs);
7     y=x(1:length(x)-delta*fs);
8     x=x(delta*fs+1:end);
9     [Sxy, f]=cpsd(x,y, rectwin(N), 0, 1000, fs, 'twosided');

      % Lines 10-14: plot the magnitude spectrum of the estimate (multiplied by
      % the sampling rate) and the theoretical value, which is unity
      % (note that |Sxy(f)|*fs = sigma_x^2 = 1).
10    figure(1)
11    plot(f,fs*abs(Sxy), f, ones(size(f)), 'r:')
12    xlabel('Frequency (Hz)')
13    ylabel('Estimate of |\itS_x_y\rm(\itf\rm)| (linear scale)')
14    axis([0 10 0 1.1])

      % Lines 15-19: plot the phase spectrum of the estimate and the theoretical
      % value of arg Sxy(f), which is -2*pi*f*Delta. Run this MATLAB program
      % again using different values of Tr.
15    figure(2)
16    plot(f, unwrap(angle(Sxy)), [0 10], [0 -2*pi*10*delta], 'r:')
17    xlabel('Frequency (Hz)')
18    ylabel('Estimate of arg\itS_x_y\rm(\itf\rm) (rad)')
19    axis([0 10 -65 0])



Results

[Figures: $|\tilde{S}_{xy}(f)|$ and $\arg \tilde{S}_{xy}(f)$ against frequency (Hz): (a1), (a2) using $T_r = 1.1$; (c1), (c2) using $T_r = 5$; (d1), (d2) using $T_r = 50$]

Comments: Note that a significant bias error occurs if the window length $T_r$ is short.
However, it is interesting to see that $\arg \tilde{S}_{xy}(f)$ is almost unaffected as long as $T_r > \Delta$,
as one might expect from Equation (10.105).
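
For orientation, the magnitude scaling predicted for a rectangular window, i.e. the factor $(1 - \Delta/T_r)$ multiplying $S_{xy}(f)$, can be tabulated for the four window lengths used in this example. This is a minimal sketch; the variable names are chosen here and the printed values are predictions, not the estimates plotted in the figures.

```matlab
% Sketch: predicted magnitude factor (1 - delta/Tr) for a rectangular window.
delta = 1;                      % delay in seconds (as in the example)
Tr    = [1.1 2 5 50];           % window lengths in seconds
scale = 1 - delta./Tr;          % predicted E[estimate]/Sxy (real and positive)
for k = 1:numel(Tr)
    fprintf('Tr = %4.1f s: predicted magnitude factor = %.3f\n', Tr(k), scale(k));
end
```

The factor is real and positive whenever $T_r > \Delta$, which is consistent with the phase being almost unaffected while the magnitude is reduced.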


11 

Multiple-Input/Response Systems 


Introduction 

This chapter briefly introduces some additions to the work presented so far. The natural 
extension is to multiple-input and multiple-output systems. The concepts of residual spectra 
and partial and multiple coherence functions offer insight into the formal matrix solutions. 
Finally principal component analysis is summarized and related to the total least squares 
method of Chapter 9. 


11.1 DESCRIPTION OF MULTIPLE-INPUT, MULTIPLE-OUTPUT 
(MIMO) SYSTEMS 

Consider the multiple-input, multiple-output system depicted in Figure 11.1. 

Assuming that the system is composed of linear elements, then any single output $y_j(t)$
(say) is

$$y_j(t) = \sum_{i=1}^{m} h_{ji}(t) * x_i(t) \tag{11.1}$$

where $h_{ji}(t)$ is the impulse response function relating the $i$th input to the $j$th output. Fourier
transforming this yields

$$Y_j(f) = \sum_{i=1}^{m} H_{ji}(f) X_i(f) \tag{11.2}$$

where $H_{ji}(f)$ is the frequency response function relating the $i$th input to the $j$th response.
The Fourier transform of the set of all responses can be arranged as a vector as

$$\mathbf{Y}(f) = \mathbf{H}(f)\mathbf{X}(f) \tag{11.3}$$
where $\mathbf{Y}(f)$ is an $n \times 1$ vector of responses, $\mathbf{X}(f)$ is an $m \times 1$ vector of inputs and $\mathbf{H}(f)$
is an $n \times m$ matrix of frequency response functions. For simplicity of notation we write the
transforms as $\mathbf{X}(f)$ rather than $\mathbf{X}_T(f)$, implying a data length $T$. Also, we imply below the
proper limitation as $T \to \infty$ etc.

Figure 11.1 A multiple-input, multiple-output system

From this the $n \times n$ output spectral density matrix $\mathbf{S}_{YY}(f) = E[\mathbf{Y}^*(f)\mathbf{Y}^T(f)]$ is

$$\mathbf{S}_{YY}(f) = \mathbf{H}^*(f)\mathbf{S}_{XX}(f)\mathbf{H}^T(f) \tag{11.4}$$

where $\mathbf{S}_{XX}(f)$ is the $m \times m$ input spectral density matrix. This expression generalizes
Equation (9.8). Note that both these matrices $\mathbf{S}_{XX}(f)$ and $\mathbf{S}_{YY}(f)$ include cross-spectra relating
the various inputs for $\mathbf{S}_{XX}(f)$ and outputs for $\mathbf{S}_{YY}(f)$.

Similarly, the input-output spectral density matrix may be expressed as
$\mathbf{S}_{XY}(f) = E[\mathbf{X}^*(f)\mathbf{Y}^T(f)]$, which becomes

$$\mathbf{S}_{XY}(f) = \mathbf{S}_{XX}(f)\mathbf{H}^T(f) \tag{11.5}$$

This is the generalization of Equation (9.12). It is tempting to use this as the basis for
‘identification’ of the matrix $\mathbf{H}^T(f)$ by forming

$$\mathbf{H}^T(f) = \mathbf{S}_{XX}^{-1}(f)\mathbf{S}_{XY}(f) \tag{11.6}$$

Immediately apparent is the potential difficulty in that we need the inverse of $\mathbf{S}_{XX}(f)$, which
might be singular. This arises if there is a linear dependency between inputs, i.e. if at least
one input can be regarded as a linear combination of the others. Under these circumstances
the determinant of $\mathbf{S}_{XX}(f)$ is zero and the rank of $\mathbf{S}_{XX}(f)$ is less than $m$. The pseudo-inverse
of $\mathbf{S}_{XX}(f)$ may be employed but this is not followed up here.
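
The matrix relations above can be checked numerically at a single frequency. The sketch below is illustrative only: the frequency response matrix `H`, the number of realizations `K` and the random ‘Fourier transforms’ `X` are hypothetical values, and the expectations are replaced by sample averages. It forms the spectral matrices with the convention $E[\mathbf{X}^*\mathbf{X}^T]$ used here, recovers $\mathbf{H}$ from Equation (11.6), and shows the rank deficiency that arises when one input is a linear combination of the others.

```matlab
% Sketch: identify H from spectral matrices at one frequency (assumed data).
m = 2;  K = 2000;                       % number of inputs and of averages
H = [1+2i, 0.5; -1i, 2-1i];             % hypothetical 2 x 2 FRF matrix
X = randn(m, K) + 1i*randn(m, K);       % K realizations of X(f)
Y = H*X;                                % noise-free responses, Y = H X
Sxx = conj(X)*X.'/K;                    % sample estimate of E[X* X^T]
Sxy = conj(X)*Y.'/K;                    % sample estimate of E[X* Y^T]
Syy = conj(Y)*Y.'/K;                    % sample estimate of E[Y* Y^T]
Hest = (Sxx\Sxy).';                     % Equation (11.6), transposed back to H
err4 = norm(Syy - conj(H)*Sxx*H.');     % check of Equation (11.4)
fprintf('max |Hest - H| = %.2e, error in (11.4) = %.2e\n', ...
        max(abs(Hest(:) - H(:))), err4);

% Linearly dependent inputs: the third input is a combination of the first two.
Xd  = [X; 2*X(1,:) - X(2,:)];
Sdd = conj(Xd)*Xd.'/K;                  % 3 x 3 input spectral density matrix
fprintf('rank of the 3 x 3 input matrix = %d\n', rank(Sdd, 1e-8));
```

In the dependent case `Sdd` has rank 2, so the plain inverse in Equation (11.6) is not available; `pinv(Sdd)` would return a minimum-norm solution instead.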


11.2 RESIDUAL RANDOM VARIABLES, PARTIAL AND MULTIPLE 
COHERENCE FUNCTIONS 

The matrix formulation in Equation (11.6) is a compact approach to dealing with multiple- 
input, multiple-output systems. However, there are other approaches aimed at revealing and 
interpreting the nature and relative importance of signals and transmission paths in systems. 
One such method is described below. This is demonstrated here by using a very simple 
example, namely a two-input, single-output system. This can easily be generalized to more 
inputs - and for more outputs each output can be taken in turn. 

Let us start by saying that we measure three signals ($x_1(t)$, $x_2(t)$, $x_3(t)$) and wish to know
how these signals may be related. An approach to this would be to ‘strip out’ progressively
the ‘effect’ of first one signal on the other two, and then what is left of the next from the
last remaining one (and so on if we had more signals). This ‘stripping out’ of one signal’s
effect on another yields what is called a ‘residual’ process. Comprehensive studies on residual
processes can be found in Bendat (1976a, 1976b, 1978) and Bendat and Piersol (1980, 2000).
We illustrate how this approach may be helpful by choosing any one of the three (say $x_3(t)$)
and identifying it as an ‘output’ $y(t)$ arising from inputs $x_1(t)$ and $x_2(t)$.

So we consider a two-input, single-output system with some uncorrelated output
measurement noise as shown in Figure 11.2.

Figure 11.2 Two-input, single-output system

On the basis of the measurements taken, this is a three-component process, $x_1$, $x_2$, $x_3 (= y)$,
where we reiterate that it may be convenient (but not necessary) to regard $y$ as an output. Based
on the assumed structure we might wish to quantify:

1. The relative magnitude of noise to ‘linear effects’, i.e. how much of $y$ is accounted for by
linear operations on $x_1$ and $x_2$.

2. The relative importance of inputs $x_1$ and $x_2$, i.e. how much of $y$ comes from each of $x_1$ and
$x_2$.

3. The frequency response functions $H_1$ and $H_2$ (i.e. estimate $H_1$ and $H_2$ from $x_1$, $x_2$ and $y$).

To start with, it is useful to remind ourselves of the concept and use of the ordinary coherence
function. With reference to Figure 11.3, suppose we have two signals $x$, $y$ and we seek a linear
‘link’ between them. Then, Figure 11.3 may be redrawn as Figure 11.4.

Figure 11.3 A single-input, single-output system with measurement noise on the output

Figure 11.4 Alternative expression of Figure 11.3 (the output is split into $y_c$, fully coherent with $x$, and $y_{uc}$, uncoherent with $x$, where $y_{uc} = n$)
If we try to estimate the ‘best’ linear operation (optimal filter $L$) on $x$ that minimizes
$E[y_{uc}^2]$, then its frequency response function is given by

$$L(f) = \frac{S_{xy}(f)}{S_{xx}(f)} \tag{11.7}$$

Also, the coherent output power is $S_{y_c y_c}(f) = \gamma^2_{xy}(f)S_{yy}(f)$ and the uncoherent (noise) power
is $S_{y_{uc}y_{uc}}(f) = [1 - \gamma^2_{xy}(f)]S_{yy}(f)$. In fact, $y_{uc}$ is the residual random variable resulting from
$y$ after a linear prediction of $y$ based on $x$ has been subtracted. Note that, in Figure 11.4,
the noise is interpreted as what is ‘left’ in the output after the linear effects of $x$ have been
removed.

We now return to the problem of three processes $x_1$, $x_2$, $x_3 (= y)$. Figure 11.2 can be
decomposed into the two stages below, as shown in Figure 11.5.


Figure 11.5 Alternative expression of Figure 11.2 (Stage 1 and Stage 2)


We should emphasize that we assume that we can only use the three measured signals
$x_1$, $x_2$ and $x_3 (= y)$. Furthermore we restrict ourselves to second-order properties of stationary
random processes. Accordingly, the only information we have available is the $3 \times 3$ spectral
density matrix linking $x_1$, $x_2$ and $x_3 (= y)$. All subsequent manipulations involve the elements
of this (Hermitian) matrix.

Stage 1 

In Figure 11.5, Stage 1 depicts the ‘stripping out’ (in a least squares optimization sense) of
the signal $x_1$ from $x_2$ and $x_3 (= y)$. The signal denoted $x_{2\cdot 1}$ is therefore what is left of $x_2$ when
the linearly correlated part of $x_1$ has been removed. The notation $x_{2\cdot 1}$ denotes the ‘residual’
random variable. Similarly, $x_{3\cdot 1}$ denotes what is left of $x_3$ when the linearly related part of $x_1$
has been removed.

To put a physical interpretation on this - it is as though process $x_1$ is ‘switched off’ and
$x_{2\cdot 1}$ and $x_{3\cdot 1}$ are what remains of $x_2$ and $x_3$ when this is done. (Once again we emphasize
that this switching off is in a least squares sense. Thus it picks out the linear link between the
signals.) The linear links between $x_1$ and $x_2$, $x_3$ are denoted $L_1$ and $L_2$. These and the following
quantities can be expressed in terms of spectra relating the residual random variables. It should
be noted that the filters $L_1$ and $L_2$ are mathematically ideal (generally non-causal) linear filters -
not ‘physical’ filters (and so should not be identified with $H_1$ and $H_2$). This introduces the
concept of residual spectral densities and partial coherence functions:
• Residual spectral density functions are ‘usual’ spectral density functions formed from
residual variables.

• Partial coherence functions are ordinary coherence functions formed from residual variables. 


First, for the pair $x_1$ and $x_2$, the ‘optimal’ linear filter linking $x_1$ and $x_2$ is

$$L_1(f) = \frac{S_{12}(f)}{S_{11}(f)} \tag{11.8}$$

where $S_{12}(f)$ is short for $S_{x_1x_2}(f)$ etc. The power spectral density of that part of $x_2$ which is
coherent with $x_1$ is

$$S_{y_2y_2}(f) = \gamma^2_{12}(f)S_{22}(f) \tag{11.9}$$

The ‘noise’ output power is $S_{y_3y_3}(f)$, which is written as $S_{22\cdot 1}(f)$, i.e.

$$S_{y_3y_3}(f) = S_{22\cdot 1}(f) = \left[1 - \gamma^2_{12}(f)\right]S_{22}(f) \tag{11.10}$$


Similarly, for the pair $x_1$ and $x_3$, the optimal filter is

$$L_2(f) = \frac{S_{13}(f)}{S_{11}(f)} \tag{11.11}$$

The spectral density of $y (= x_3)$ is $S_{yy}(f) = S_{33}(f) = S_{y_4y_4}(f) + S_{y_5y_5}(f)$, where $S_{y_4y_4}(f)$ is
the power spectral density of that part of $y$ that is coherent with $x_1$ and $S_{y_5y_5}(f) = S_{33\cdot 1}(f)$ is
the power spectral density of that part of $y$ that is uncoherent with $x_1$, i.e.

$$S_{y_4y_4}(f) = \gamma^2_{13}(f)S_{33}(f) \tag{11.12}$$

$$S_{y_5y_5}(f) = S_{33\cdot 1}(f) = \left[1 - \gamma^2_{13}(f)\right]S_{33}(f) \tag{11.13}$$


From Equations (11.10) and (11.13), we see that the residual spectral density functions
$S_{22\cdot 1}(f)$ and $S_{33\cdot 1}(f)$ are computed from the ‘usual’ spectral density functions and ordinary
coherence functions. Similarly, the residual spectral density function $S_{23\cdot 1}(f)$, which is the
cross-spectral density between $x_{2\cdot 1}$ and $x_{3\cdot 1}$, can also be expressed in terms of the spectral
density functions formed from the measured signals $x_1$, $x_2$ and $x_3$. We do this as follows.

From the definition of the cross-spectral density function, $S_{23\cdot 1}(f)$ is

$$S_{23\cdot 1}(f) = \lim_{T \to \infty} \frac{E\left[X^*_{2\cdot 1}(f)X_{3\cdot 1}(f)\right]}{T} \tag{11.14}$$

Since $X_2 = Y_2 + X_{2\cdot 1} = L_1X_1 + X_{2\cdot 1}$ and $X_3 = Y_4 + X_{3\cdot 1} = L_2X_1 + X_{3\cdot 1}$, and using
Equations (11.8) and (11.11), it can be shown that

$$S_{23\cdot 1}(f) = S_{23}(f) - \frac{S_{21}(f)S_{13}(f)}{S_{11}(f)} \tag{11.15}$$

So the residual spectral density $S_{23\cdot 1}(f)$ can be computed in terms of the usual spectral density
functions.

We now introduce the concept of the partial coherence function. This is the ‘ordinary’
coherence function but linking residual random variables. The partial coherence function
between $x_{2\cdot 1}$ and $x_{3\cdot 1}$ can be computed using the above results, and is

$$\gamma^2_{23\cdot 1}(f) = \frac{\left|S_{23\cdot 1}(f)\right|^2}{S_{22\cdot 1}(f)S_{33\cdot 1}(f)} \tag{11.16}$$
Stage 2

In Figure 11.5, Stage 2 depicts the ‘removal’ of $x_{2\cdot 1}$ from $x_{3\cdot 1}$. As before, for the pair $x_{2\cdot 1}$ and
$x_{3\cdot 1}$, the optimal filter is

$$L_3(f) = \frac{S_{23\cdot 1}(f)}{S_{22\cdot 1}(f)} \tag{11.17}$$

In the figure, $x_{3\cdot 1,2}$ denotes the residual variable arising when the linear effects of both $x_1$ and
$x_2$ are removed from $x_3$. The output powers of uncoherent and coherent components with $x_{2\cdot 1}$
are

$$S_{y_7y_7}(f) = S_{nn}(f) = S_{33\cdot 1,2}(f) = \left[1 - \gamma^2_{23\cdot 1}(f)\right]S_{33\cdot 1}(f) \tag{11.18}$$

$$S_{y_6y_6}(f) = \gamma^2_{23\cdot 1}(f)S_{33\cdot 1}(f) \tag{11.19}$$

Note that $S_{33\cdot 1,2}(f)$ is the power spectral density of that part of $y$ unaccounted for by linear
operations on $x_1$ and $x_2$, i.e. the uncorrelated noise power. Now, combining the above results
and using Figure 11.5, the power spectral density of $y$ can be decomposed into

$$\begin{aligned}
S_{yy}(f) = S_{33}(f) &= S_{y_4y_4}(f) + S_{y_6y_6}(f) + S_{y_7y_7}(f) \\
&= \gamma^2_{13}(f)S_{33}(f) && \text{part fully coherent with } x_1 \\
&\quad + \gamma^2_{23\cdot 1}(f)S_{33\cdot 1}(f) && \text{part fully coherent with } x_2 \text{ after } x_1 \text{ has been removed from } x_2 \text{ and } x_3 \\
&\quad + \left[1 - \gamma^2_{23\cdot 1}(f)\right]S_{33\cdot 1}(f) && \text{uncoherent with both } x_1 \text{ and } x_2
\end{aligned} \tag{11.20}$$

This equation shows the role of the partial coherence function.

Note that by following the signal flow in Figure 11.5, one can easily verify that

$$H_1(f) = L_2(f) - L_1(f)L_3(f) \tag{11.21}$$

and

$$H_2(f) = L_3(f) \tag{11.22}$$


A multiple coherence function is defined in a manner similar to that of the ordinary
coherence function. Recall that the ordinary coherence function for the system shown in
Figure 11.3 can be written as $\gamma^2_{xy}(f) = (S_{yy}(f) - S_{nn}(f))/S_{yy}(f)$. Similarly, the multiple
coherence function denoted $\gamma^2_{y:x}(f)$ is defined as

$$\gamma^2_{y:x}(f) = \frac{S_{yy}(f) - S_{nn}(f)}{S_{yy}(f)} \tag{11.23}$$

That is, the multiple coherence function $\gamma^2_{y:x}(f)$ is the fraction of output power accounted for
via linear operations on the inputs; it is a measure of how well the inputs account for the
measured response of the system. For the example shown above, it can be written as

$$\gamma^2_{y:x}(f) = \frac{S_{33}(f) - S_{33\cdot 1,2}(f)}{S_{33}(f)} \tag{11.24}$$

Note that the nearer $\gamma^2_{y:x}(f)$ is to unity, the more ‘completely’ does the linear model apply
to the three components. Using Equations (11.12) and (11.18), the above equation can be
written as

$$\gamma^2_{y:x}(f) = 1 - \left(1 - \gamma^2_{13}(f)\right)\left(1 - \gamma^2_{23\cdot 1}(f)\right) \tag{11.25}$$

which shows that it can be computed in terms of partial and ordinary coherence functions. See
Sutton et al. (1994) for applications of the multiple coherence function.
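
The two-stage procedure can be followed numerically at a single frequency. In the sketch below the frequency responses `H1`, `H2`, the input spectra and the noise spectrum `Snn` are hypothetical values chosen for illustration, and the convention $S_{ij} = E[X_i^*X_j]$ is assumed. The spectra of the two-input, single-output model are built first; the residual spectra, the filters $L_1$, $L_2$, $L_3$ and the coherence functions are then computed from those spectra alone.

```matlab
% Sketch: residual spectra, partial and multiple coherence (assumed values).
H1 = 2 - 1i;   H2 = 0.5 + 0.3i;            % hypothetical frequency responses
S11 = 1.0;  S22 = 2.0;  S12 = 0.4 + 0.2i;  % hypothetical input spectra
Snn = 0.3;                                 % hypothetical output noise spectrum
S21 = conj(S12);
S13 = H1*S11 + H2*S12;   S31 = conj(S13);  % cross-spectra with x3 = y
S23 = H1*S21 + H2*S22;
S33 = abs(H1)^2*S11 + abs(H2)^2*S22 + 2*real(conj(H1)*H2*S12) + Snn;

% Stage 1: remove the linear effect of x1 from x2 and x3.
S22_1 = real(S22 - S21*S12/S11);           % Equation (11.10)
S33_1 = real(S33 - S31*S13/S11);           % Equation (11.13)
S23_1 = S23 - S21*S13/S11;                 % Equation (11.15)

% Stage 2: remove the linear effect of x2.1 from x3.1.
S33_12 = S33_1 - abs(S23_1)^2/S22_1;       % Equation (11.18)

L1 = S12/S11;  L2 = S13/S11;  L3 = S23_1/S22_1;
H1est = L2 - L1*L3;   H2est = L3;          % Equations (11.21) and (11.22)
g13   = abs(S13)^2/(S11*S33);              % ordinary coherence
g23_1 = abs(S23_1)^2/(S22_1*S33_1);        % partial coherence, Equation (11.16)
gmult = (S33 - S33_12)/S33;                % Equation (11.24)
gprod = 1 - (1 - g13)*(1 - g23_1);         % Equation (11.25)
fprintf('residual noise power = %.4f (Snn = %.4f)\n', S33_12, Snn);
fprintf('multiple coherence: %.6f (11.24), %.6f (11.25)\n', gmult, gprod);
```

With these values the recovered `H1est` and `H2est` coincide with `H1` and `H2`, and the final residual spectrum equals the noise spectrum, as the signal flow in Figure 11.5 suggests.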


Computation of Residual Spectra and Interpretation in Terms 
of Gaussian Elimination 

The above example illustrates the methodology. A more general computational formulation
is given below which operates on the elements of the spectral density matrix of the measured
signals. The residual spectral density function $S_{23\cdot 1}(f)$ given in Equation (11.15) can be
generalized as

$$S_{ij\cdot k}(f) = S_{ij}(f) - \frac{S_{ik}(f)S_{kj}(f)}{S_{kk}(f)} \tag{11.26}$$

This can be extended as

$$S_{ij\cdot k,l}(f) = S_{ij\cdot k}(f) - \frac{S_{il\cdot k}(f)S_{lj\cdot k}(f)}{S_{ll\cdot k}(f)} \tag{11.27}$$

We use the above expressions to ‘condense’ successive cross-spectral density matrices, e.g.

$$\begin{bmatrix} S_{11}(f) & S_{12}(f) & S_{13}(f) \\ S_{21}(f) & S_{22}(f) & S_{23}(f) \\ S_{31}(f) & S_{32}(f) & S_{33}(f) \end{bmatrix} \Rightarrow \begin{bmatrix} S_{22\cdot 1}(f) & S_{23\cdot 1}(f) \\ S_{32\cdot 1}(f) & S_{33\cdot 1}(f) \end{bmatrix} \Rightarrow \begin{bmatrix} S_{33\cdot 1,2}(f) \end{bmatrix} \tag{11.28}$$


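This can be extended to larger systems. This ‘condensation’ can be interpreted through
Gaussian elimination (row manipulations) as follows (where $r_i$ is the $i$th row):

Step 1: $r_2 \to r_2 - r_1 \times \left(\dfrac{S_{21}(f)}{S_{11}(f)}\right)$; $r_3 \to r_3 - r_1 \times \left(\dfrac{S_{31}(f)}{S_{11}(f)}\right)$ gives

$$\begin{bmatrix} S_{11}(f) & S_{12}(f) & S_{13}(f) \\ S_{21}(f) & S_{22}(f) & S_{23}(f) \\ S_{31}(f) & S_{32}(f) & S_{33}(f) \end{bmatrix} \Rightarrow \begin{bmatrix} S_{11}(f) & S_{12}(f) & S_{13}(f) \\ 0 & S_{22\cdot 1}(f) & S_{23\cdot 1}(f) \\ 0 & S_{32\cdot 1}(f) & S_{33\cdot 1}(f) \end{bmatrix} \tag{11.29}$$

Step 2: $r_3 \to r_3 - r_2 \times \left(\dfrac{S_{32\cdot 1}(f)}{S_{22\cdot 1}(f)}\right)$ gives

$$\begin{bmatrix} S_{11}(f) & S_{12}(f) & S_{13}(f) \\ 0 & S_{22\cdot 1}(f) & S_{23\cdot 1}(f) \\ 0 & S_{32\cdot 1}(f) & S_{33\cdot 1}(f) \end{bmatrix} \Rightarrow \begin{bmatrix} S_{11}(f) & S_{12}(f) & S_{13}(f) \\ 0 & S_{22\cdot 1}(f) & S_{23\cdot 1}(f) \\ 0 & 0 & S_{33\cdot 1,2}(f) \end{bmatrix} \tag{11.30}$$

i.e. the residual spectral density functions arise naturally.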
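
The equivalence between the condensation formulae and row operations is easy to confirm numerically. The sketch below uses a randomly generated Hermitian, positive definite matrix as a stand-in for a measured spectral density matrix at one frequency (an assumption for illustration), applies the two elimination steps, and compares the result with Equations (11.26) and (11.27).

```matlab
% Sketch: residual spectra obtained by Gaussian elimination (assumed matrix).
A = randn(3) + 1i*randn(3);
S = A'*A + eye(3);              % hypothetical 3 x 3 Hermitian spectral matrix
U = S;
for k = 1:2                     % eliminate the entries below the k-th pivot
    for r = k+1:3
        U(r,:) = U(r,:) - U(k,:)*(U(r,k)/U(k,k));
    end
end

% Direct evaluation of the residual spectra, Equations (11.26) and (11.27).
res   = @(M, a, b, c) M(a,b) - M(a,c)*M(c,b)/M(c,c);
S22_1 = res(S,2,2,1);   S23_1 = res(S,2,3,1);
S32_1 = res(S,3,2,1);   S33_1 = res(S,3,3,1);
S33_12 = S33_1 - S32_1*S23_1/S22_1;

fprintf('|U(2,3) - S23.1|   = %.2e\n', abs(U(2,3) - S23_1));
fprintf('|U(3,3) - S33.1,2| = %.2e\n', abs(U(3,3) - S33_12));
```

The same loop extends to larger matrices by changing the loop limits, which is the sense in which the condensation generalizes.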
Further interpretation can be obtained by starting from Figure 11.2 again. Since
$X_3(f) = H_1(f)X_1(f) + H_2(f)X_2(f) + N(f)$, the cross-spectral density functions are written as

$$\begin{aligned}
S_{13}(f) &= H_1(f)S_{11}(f) + H_2(f)S_{12}(f) \\
S_{23}(f) &= H_1(f)S_{21}(f) + H_2(f)S_{22}(f)
\end{aligned} \tag{11.31}$$

Solving for $H_1(f)$ and $H_2(f)$ using Gaussian elimination (eliminate the term $H_1(f)S_{21}(f)$ in
the second equation of (11.31)) gives

$$\begin{aligned}
S_{11}(f)H_1(f) + S_{12}(f)H_2(f) &= S_{13}(f) \\
\left(S_{22}(f) - \frac{S_{21}(f)S_{12}(f)}{S_{11}(f)}\right)H_2(f) &= S_{23}(f) - \frac{S_{21}(f)S_{13}(f)}{S_{11}(f)}
\end{aligned} \tag{11.32}$$

Thus,

$$H_2(f) = \frac{S_{23\cdot 1}(f)}{S_{22\cdot 1}(f)} \quad \text{and} \quad H_1(f) = \frac{S_{13}(f)}{S_{11}(f)} - \frac{S_{12}(f)}{S_{11}(f)}\frac{S_{23\cdot 1}(f)}{S_{22\cdot 1}(f)}$$

11.3 PRINCIPAL COMPONENT ANALYSIS 


Although residual spectral analysis is useful in source identification, condition monitoring, 
etc., the shortcoming of the method is that prior ranking of the input signals is often required 
(see Bendat and Piersol, 1980), i.e. a priori knowledge. Principal component analysis (PCA) 
is a general approach to explore correlation patterns (Otte et al., 1988). 

Suppose we have three processes $x_1$, $x_2$, $x_3$. Then we start as before by forming the
cross-spectral density matrix

$$\mathbf{S} = \begin{bmatrix} S_{11}(f) & S_{12}(f) & S_{13}(f) \\ S_{21}(f) & S_{22}(f) & S_{23}(f) \\ S_{31}(f) & S_{32}(f) & S_{33}(f) \end{bmatrix} \tag{11.33}$$

Note that this is a Hermitian matrix, i.e. $\mathbf{S} = \mathbf{S}^{*T} = \mathbf{S}^H$, where $\mathbf{S}^H$ is the conjugate transpose.
If there is a linear relationship between the processes $x_i$, then the determinant of this matrix
is zero (i.e. its rank is less than three). If there is no linear relationship then its rank is three.

Suppose the matrix is full rank (i.e. rank 3). Then eigenvalue (or singular value)
decomposition gives

$$\mathbf{S} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^H \tag{11.34}$$

where $\boldsymbol{\Lambda}$ is a diagonal matrix that contains eigenvalues of $\mathbf{S}$, and $\mathbf{U}$ is a unitary matrix whose
columns are the corresponding eigenvectors. We may describe the physical interpretation of
this as follows. Suppose there exist three (fictitious) processes $z_1$, $z_2$, $z_3$ that are mutually
uncorrelated and from which $x_1$, $x_2$, $x_3$ can be derived, i.e. for each frequency $f$,

$$\begin{bmatrix} X_1(f) \\ X_2(f) \\ X_3(f) \end{bmatrix} = \begin{bmatrix} m_{11}(f) & m_{12}(f) & m_{13}(f) \\ m_{21}(f) & m_{22}(f) & m_{23}(f) \\ m_{31}(f) & m_{32}(f) & m_{33}(f) \end{bmatrix} \begin{bmatrix} Z_1(f) \\ Z_2(f) \\ Z_3(f) \end{bmatrix} \tag{11.35}$$

i.e. $\mathbf{X}(f) = \mathbf{M}(f)\mathbf{Z}(f)$. Conceptually, this can be depicted as in Figure 11.6.
Figure 11.6 Virtual signals and measured signals (the uncorrelated variables $z_i$ are mapped to the measured signals $x_i$)


Then, forming the spectral density matrix gives

$$\mathbf{S}_{XX}(f) = \mathbf{S} = \mathbf{M}^*(f)\mathbf{S}_{ZZ}(f)\mathbf{M}^T(f) \tag{11.36}$$

Since the $z_i$ are mutually uncorrelated, $\mathbf{S}_{ZZ}(f)$ is a diagonal matrix. Thus, Equation (11.36)
has the same form as Equation (11.34), i.e. $\mathbf{S}_{ZZ}(f) = \boldsymbol{\Lambda}$ and $\mathbf{M}^*(f) = \mathbf{U}$. So, the eigenvalues
of $\mathbf{S}$ are the power spectra of these fictitious processes and their (relative) magnitudes serve
to define the principal components referred to, i.e. $z_i$ are the principal components.

Note, however, that it is important not to think of these as physical entities, e.g. it is quite
possible that more than three actual independent processes combine to make up $x_1$, $x_2$ and $x_3$.
The fictitious processes $z_1$, $z_2$, $z_3$ are merely a convenient concept. These signals are called
virtual signals. Note also that the power of these virtual signals is of course not the power
of the measured signals. It is therefore interesting to establish to what degree each principal
component contributes to the power of the measured signals. To see this, for example, consider
$X_1(f)$ which can be written as (from Equation (11.35))

$$X_1(f) = m_{11}(f)Z_1(f) + m_{12}(f)Z_2(f) + m_{13}(f)Z_3(f) \tag{11.37}$$

Then, since the $z_i$ are uncorrelated the power spectral density function $S_{x_1x_1}(f)$ can be written
as

$$S_{x_1x_1}(f) = |m_{11}(f)|^2 S_{z_1z_1}(f) + |m_{12}(f)|^2 S_{z_2z_2}(f) + |m_{13}(f)|^2 S_{z_3z_3}(f) \tag{11.38}$$

and the power due to $z_1$ is $\gamma^2_{z_1x_1}(f)S_{x_1x_1}(f)$, where

$$\gamma^2_{z_1x_1}(f) = \frac{\left|S_{z_1x_1}(f)\right|^2}{S_{z_1z_1}(f)S_{x_1x_1}(f)} \tag{11.39}$$


This is a virtual coherence function. More generally, the virtual coherence function between 
the ith virtual input Zi and the y'th measured signal Xj can be written as 


)&//) = 


lw /)| 2 

s ZiZl U)S xjXi (f) 


(11.40) 


Since the cross-spectral density function between $z_i$ and $x_j$ can be obtained by

$$
S_{z_i x_j}(f) = m_{ji}(f)\,S_{z_i z_i}(f)
\tag{11.41}
$$

we see that the virtual coherence function: (i) can be computed from the eigenvalues and
eigenvectors of $\mathbf{S}$; (ii) gives a measure of what proportion of $S_{x_j x_j}(f)$ comes from a particular
component $z_i$.

For the details of practical applications of principal component analysis, especially for 
noise source identification problems, see Otte et al. (1988). 
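
As a small numerical illustration of Equations (11.38) to (11.41), the following Python sketch computes the virtual coherence functions at a single frequency. The matrix `M` and the virtual signal powers `S_zz` are made-up values chosen for illustration only; in practice they would come from the eigendecomposition of a measured spectral density matrix. The function name `virtual_coherence` is also our own.

```python
# Illustrative values at one frequency (all numbers are assumptions).
# M[j][i] plays the role of m_ji(f): contribution of virtual signal z_i
# to measured signal x_j.
M = [
    [1.0 + 0.5j, 0.2 - 0.1j, 0.05j],
    [0.3j, 0.8 + 0.0j, 0.1 - 0.2j],
    [0.4 - 0.4j, 0.1 + 0.3j, 0.6 + 0.0j],
]
S_zz = [4.0, 1.0, 0.25]  # power spectra of z_1, z_2, z_3


def virtual_coherence(M, S_zz):
    """Return gamma2[i][j], the virtual coherence between z_i and x_j."""
    n = len(S_zz)
    gamma2 = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Equation (11.38) written for the jth measured signal
        S_xjxj = sum(abs(M[j][i]) ** 2 * S_zz[i] for i in range(n))
        for i in range(n):
            S_zixj = M[j][i] * S_zz[i]                              # Equation (11.41)
            gamma2[i][j] = abs(S_zixj) ** 2 / (S_zz[i] * S_xjxj)    # Equation (11.40)
    return gamma2


gamma2 = virtual_coherence(M, S_zz)
for j in range(3):
    shares = [round(gamma2[i][j], 3) for i in range(3)]
    print(f"x_{j + 1}: proportion of power from z_1, z_2, z_3 =", shares)
```

For each measured signal the three virtual coherences add up to one, which is simply Equation (11.38) divided through by $S_{x_j x_j}(f)$.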


Relationship to the System Identification Methods¹

It is interesting to relate principal component analysis (PCA) to the system identification 
methods we described in Chapter 9. Let x denote a column vector of observations (e.g. x 
(input) and y (output) in Figure 9.9 in Chapter 9) with correlation matrix 

$$
\mathbf{R}_{xx} = E[\mathbf{x}\mathbf{x}^T]
\tag{11.42}
$$

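Let $\mathbf{x}$ be derived from a set of uncorrelated processes $\mathbf{z}$ (through the transformation matrix $\mathbf{T}$)
by

$$
\mathbf{x} = \mathbf{T}\mathbf{z}
\tag{11.43}
$$

Then, the correlation matrix is

$$
\mathbf{R}_{xx} = \mathbf{T}E[\mathbf{z}\mathbf{z}^T]\mathbf{T}^T = \mathbf{T}\mathbf{R}_{zz}\mathbf{T}^T
\tag{11.44}
$$

Since the elements of $\mathbf{z}$ are uncorrelated, $\mathbf{R}_{zz} = \boldsymbol{\Lambda}$, where $\boldsymbol{\Lambda}$ is a diagonal matrix that contains
the eigenvalues of $\mathbf{R}_{xx}$. So,

$$
\mathbf{R}_{xx} = \mathbf{T}\boldsymbol{\Lambda}\mathbf{T}^T
\tag{11.45}
$$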


This is an eigendecomposition of the correlation matrix R xx , and T is an orthogonal matrix 
whose columns are the corresponding eigenvectors, i.e. 


$$
\mathbf{T} = [\,\mathbf{t}_1 \;\; \mathbf{t}_2\,] =
\begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix}
\tag{11.46}
$$


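Let us apply this to the pair of variables ($x$ (input) and $y$ (output)) as in Figure 9.9 in
Chapter 9. Assuming zero mean values, the correlation matrix is

$$
\mathbf{R}_{xx} = E\left[\begin{bmatrix} x \\ y \end{bmatrix}\begin{bmatrix} x & y \end{bmatrix}\right]
= \begin{bmatrix} E[xx] & E[xy] \\ E[xy] & E[yy] \end{bmatrix}
= \begin{bmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{bmatrix}
\tag{11.47}
$$

The eigenvalues are found from

$$
\det(\mathbf{R}_{xx} - \lambda\mathbf{I}) =
\begin{vmatrix} \sigma_x^2 - \lambda & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 - \lambda \end{vmatrix} = 0
\tag{11.48}
$$

i.e.

$$
\lambda_{1,2} = \frac{\sigma_x^2 + \sigma_y^2 \pm \sqrt{(\sigma_x^2 - \sigma_y^2)^2 + 4\sigma_{xy}^2}}{2}
\tag{11.49}
$$

¹ See Tan (2005).

The eigenvectors $\mathbf{t}_1$ and $\mathbf{t}_2$ corresponding to these eigenvalues are orthogonal and define a
basis set of the data, as shown in Figure 11.7.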

From the figure, the slope of the eigenvector corresponding to the largest eigenvalue is
the PCA ‘gain’ relating $y$ to $x$. Using the eigenvectors $\mathbf{T} = [\,\mathbf{t}_1 \;\; \mathbf{t}_2\,]$, Equation (11.43) can
be expanded as

$$
\begin{aligned}
x &= t_{11}z_1 + t_{12}z_2 \\
y &= t_{21}z_1 + t_{22}z_2
\end{aligned}
\tag{11.50}
$$

We note that the first principal component $z_1$ is related to the largest eigenvalue, and the part
due to $z_1$ is $t_{11}z_1$ for input $x$, and $t_{21}z_1$ for output $y$. The gain relating $y$ to $x$ (corresponding to
the first principal component $z_1$) is then given by the ratio $t_{21}/t_{11}$. The ratio can be found from

$$
(\mathbf{R}_{xx} - \lambda_1\mathbf{I})\,\mathbf{t}_1 = \mathbf{0}
\tag{11.51}
$$

and so

$$
\frac{t_{21}}{t_{11}} = \frac{\sigma_y^2 - \sigma_x^2 + \sqrt{(\sigma_x^2 - \sigma_y^2)^2 + 4\sigma_{xy}^2}}{2\sigma_{xy}}
\tag{11.52}
$$

We see from this that Equation (11.52) is the total least squares gain ($a_T$, see Equation (9.59)).
This equivalence follows from the fact that both the PCA approach and TLS minimize the
power ‘normal’ to the ‘principal’ eigenvector.
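
The equivalence can be checked numerically. The following Python sketch is a minimal illustration, assuming synthetic zero-mean data with a true gain of 2 and equal noise on both input and output (all of these values are our own choices). It computes the gain from the eigenvector relation (11.51) and from the closed form (11.52), and compares them with the ordinary least squares gain $\sigma_{xy}/\sigma_x^2$.

```python
import math
import random

random.seed(0)

# Synthetic zero-mean data: y = 2x, with noise added to both x and y
# (the gain and noise levels are assumed values for illustration).
true_gain = 2.0
clean = [random.gauss(0.0, 1.0) for _ in range(5000)]
x = [v + random.gauss(0.0, 0.3) for v in clean]
y = [true_gain * v + random.gauss(0.0, 0.3) for v in clean]

n = len(x)
s_xx = sum(v * v for v in x) / n              # sigma_x^2
s_yy = sum(v * v for v in y) / n              # sigma_y^2
s_xy = sum(a * b for a, b in zip(x, y)) / n   # sigma_xy

root = math.sqrt((s_xx - s_yy) ** 2 + 4.0 * s_xy ** 2)
lam1 = 0.5 * (s_xx + s_yy + root)             # largest eigenvalue, Equation (11.49)

# First row of (R_xx - lam1 I) t_1 = 0 gives t_21 / t_11, Equation (11.51)
pca_gain = (lam1 - s_xx) / s_xy
# Closed form of Equation (11.52), i.e. the total least squares gain
tls_gain = (s_yy - s_xx + root) / (2.0 * s_xy)
# Ordinary least squares gain, which is biased low by the input noise
ls_gain = s_xy / s_xx

print("PCA gain:", pca_gain)
print("TLS gain:", tls_gain)
print("LS gain: ", ls_gain)
```

With noise of equal variance on both variables, as assumed here, the PCA/TLS gain stays close to the true value while the ordinary least squares gain is pulled towards zero.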


Appendix A 

Proof of $\int_{-\infty}^{\infty} 2M \frac{\sin 2\pi aM}{2\pi aM}\, da = 1$


We first consider the contour integration of a function $F(z) = e^{jz} f(z) = e^{jz}/z$ around a closed
contour in the $z$-plane as shown in Figure A.1, where $z = x + jy$.

Figure A.1 A contour with a single pole at $z = 0$


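Using Cauchy’s residue theorem, the contour integral becomes

$$
\oint \frac{e^{jz}}{z}\,dz = \int_{C_R} \frac{e^{jz}}{z}\,dz + \int_{-R}^{-\rho} \frac{e^{jx}}{x}\,dx
+ \int_{C_\rho} \frac{e^{jz}}{z}\,dz + \int_{\rho}^{R} \frac{e^{jx}}{x}\,dx = 0
\tag{A.1}
$$

From Jordan’s lemma, the first integral on the right of the first equality is zero if $R \to \infty$, i.e.
$\lim_{R \to \infty} \int_{C_R} e^{jz} f(z)\,dz = 0$. Letting $z = \rho e^{j\theta}$ and $dz = j\rho e^{j\theta}\,d\theta$, where $\theta$ varies from $\pi$ to
0, the third integral can be written as

$$
\int_{C_\rho} \frac{e^{jz}}{z}\,dz = \int_{\pi}^{0} \frac{e^{j\rho e^{j\theta}}}{\rho e^{j\theta}}\, j\rho e^{j\theta}\,d\theta
= j\int_{\pi}^{0} e^{j\rho e^{j\theta}}\,d\theta
\tag{A.2}
$$

Taking the limit as $\rho \to 0$, this becomes

$$
\lim_{\rho \to 0} j\int_{\pi}^{0} e^{j\rho e^{j\theta}}\,d\theta = j\theta\Big|_{\pi}^{0} = -j\pi
\tag{A.3}
$$

Now, consider the second and fourth integrals together:

$$
\int_{-R}^{-\rho} \frac{e^{jx}}{x}\,dx + \int_{\rho}^{R} \frac{e^{jx}}{x}\,dx
= \int_{-R}^{-\rho} \frac{\cos x + j\sin x}{x}\,dx + \int_{\rho}^{R} \frac{\cos x + j\sin x}{x}\,dx
\tag{A.4}
$$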


Since $\cos(x)/x$ is odd, the cosine terms cancel in the resulting integration. Thus, Equation
(A.4) becomes

$$
\int_{-R}^{-\rho} \frac{e^{jx}}{x}\,dx + \int_{\rho}^{R} \frac{e^{jx}}{x}\,dx = 2j\int_{\rho}^{R} \frac{\sin x}{x}\,dx
\tag{A.5}
$$

Combining the above results, for $R \to \infty$ and $\rho \to 0$, Equation (A.1) reduces to

$$
\lim_{\substack{R \to \infty \\ \rho \to 0}} \left[ 2j\int_{\rho}^{R} \frac{\sin x}{x}\,dx \right]
= 2j\int_{0}^{\infty} \frac{\sin x}{x}\,dx = j\pi
\tag{A.6}
$$

Thus, we have the following result:

$$
\int_{0}^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}
\tag{A.7}
$$

We now go back to our problem. We have written

$$
\lim_{M \to \infty} 2M\,\frac{\sin 2\pi aM}{2\pi aM} = \delta(a)
$$

in Chapter 3. In order to justify this, the integral of the function

$$
f(a) = 2M\,\frac{\sin 2\pi aM}{2\pi aM}
$$


must be unity. We verify this using the above result. Letting $x = 2\pi aM$ and $dx = 2\pi M\,da$,
we have

$$
\int_{-\infty}^{\infty} f(a)\,da = \int_{-\infty}^{\infty} 2M\,\frac{\sin 2\pi aM}{2\pi aM}\,da
= \int_{-\infty}^{\infty} 2M\,\frac{\sin x}{x}\,\frac{dx}{2\pi M}
= \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx
\tag{A.8}
$$

From Equation (A.7),

$$
\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx = 2\int_{0}^{\infty} \frac{\sin x}{x}\,dx = \pi
$$

thus Equation (A.8) becomes

$$
\int_{-\infty}^{\infty} 2M\,\frac{\sin 2\pi aM}{2\pi aM}\,da = 1
\tag{A.9}
$$

This proves that the integral of the function in Figure A.2 (i.e. Figure 3.11) is unity.

Figure A.2 Representation of the delta function using a sinc function
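
The result (A.9) can also be checked numerically. The following Python sketch is a rough illustration only: it applies the trapezoidal rule over a finite interval, so the answer approaches unity slowly as the interval is widened. The function name `sinc_integral`, the interval half-width and the step density are our own assumptions.

```python
import math


def sinc_integral(M, half_width, steps_per_unit=2000):
    """Trapezoidal estimate of the integral of
    f(a) = 2M sin(2*pi*a*M) / (2*pi*a*M) over [-half_width, half_width]."""

    def f(a):
        x = 2.0 * math.pi * a * M
        # The limiting value at x = 0 is 2M
        return 2.0 * M if x == 0.0 else 2.0 * M * math.sin(x) / x

    n = int(2 * half_width * steps_per_unit)
    h = 2.0 * half_width / n
    total = 0.5 * (f(-half_width) + f(half_width))
    for i in range(1, n):
        total += f(-half_width + i * h)
    return total * h


for M in (1.0, 5.0):
    print("M =", M, " integral ~", sinc_integral(M, half_width=100.0))
```

The estimate is close to one for both values of $M$: the main lobe becomes taller and narrower as $M$ increases, but the area is unchanged.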


Appendix B 

Proof of $|S_{xy}(f)|^2 \le S_{xx}(f)S_{yy}(f)$


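Suppose we have $Z_T(f)$ consisting of two quantities $X_T(f)$ and $Y_T(f)$ such that

$$
Z_T(f) = u_1 X_T(f) + u_2 Y_T(f)
\tag{B.1}
$$

where $u_i$ are arbitrary complex constants. Then the power spectral density function $S_{zz}(f)$
can be written as

$$
\begin{aligned}
S_{zz}(f) &= \lim_{T \to \infty} \frac{E[Z_T^*(f)Z_T(f)]}{T}
= u_1^* S_{xx}(f)u_1 + u_2^* S_{yx}(f)u_1 + u_1^* S_{xy}(f)u_2 + u_2^* S_{yy}(f)u_2 \\
&= \begin{bmatrix} u_1^* & u_2^* \end{bmatrix}
\begin{bmatrix} S_{xx}(f) & S_{xy}(f) \\ S_{yx}(f) & S_{yy}(f) \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
= \mathbf{u}^H \mathbf{S}\mathbf{u}
\end{aligned}
\tag{B.2}
$$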


where S is the cross-spectral density matrix. 

$\mathbf{S}$ is a Hermitian matrix. Moreover, this cross-spectral density matrix is positive semi-definite,
i.e. for all non-zero (complex) vectors $\mathbf{u}$, $\mathbf{u}^H\mathbf{S}\mathbf{u} \ge 0$ since the power spectral density
function $S_{zz}(f)$ is non-negative for all frequencies $f$.

Since the matrix $\mathbf{S}$ is positive semi-definite, its determinant must be non-negative, i.e.

$$
\begin{vmatrix} S_{xx}(f) & S_{xy}(f) \\ S_{yx}(f) & S_{yy}(f) \end{vmatrix} \ge 0
\tag{B.3}
$$

or $S_{xx}(f)S_{yy}(f) - S_{xy}(f)S_{yx}(f) \ge 0$, i.e.

$$
S_{xy}(f)S_{yx}(f) \le S_{xx}(f)S_{yy}(f)
\tag{B.4}
$$

Since $S_{yx}(f) = S_{xy}^*(f)$, it follows that

$$
|S_{xy}(f)|^2 \le S_{xx}(f)S_{yy}(f)
\tag{B.5}
$$

Note that it can easily be verified that a multi-dimensional cross-spectral density matrix S 
(say, n x n) is also a positive semi-definite Hermitian matrix. 
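
A quick numerical check of (B.5) is sketched below in Python. It is only an illustration: the complex numbers standing in for $X_T(f)$ and $Y_T(f)$ at a single frequency are synthetic, and the mixing coefficient and the number of records are assumed values. The averaged spectral products always satisfy the inequality, so the determinant of the estimated $2 \times 2$ matrix is non-negative.

```python
import random

random.seed(1)


def averaged_spectra(num_records=50):
    """Average raw spectral products at one frequency over several records.

    X and Y stand for the finite Fourier transforms X_T(f) and Y_T(f) of
    each record; here they are synthetic complex Gaussian numbers.
    """
    s_xx = 0.0
    s_yy = 0.0
    s_xy = 0.0 + 0.0j
    for _ in range(num_records):
        X = complex(random.gauss(0, 1), random.gauss(0, 1))
        N = complex(random.gauss(0, 1), random.gauss(0, 1))
        Y = (0.7 - 0.4j) * X + 0.5 * N   # partially coherent with X
        s_xx += abs(X) ** 2
        s_yy += abs(Y) ** 2
        s_xy += X.conjugate() * Y
    return s_xx / num_records, s_yy / num_records, s_xy / num_records


s_xx, s_yy, s_xy = averaged_spectra()
det_S = s_xx * s_yy - abs(s_xy) ** 2       # determinant of the 2 x 2 matrix S
ratio = abs(s_xy) ** 2 / (s_xx * s_yy)     # lies between 0 and 1 by (B.5)
print("det S =", det_S, " |S_xy|^2 / (S_xx S_yy) =", ratio)
```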
Appendix C

Wave Number Spectra
and An Application


Rather than a ‘time series’ we may consider a function of ‘space’. This might be demonstrated
with height indications on a rough road, for example, as shown in Figure C.1.

Figure C.1 Height profile of a rough road



If the process is ‘spatially stationary’ (homogeneous) we may characterize it by the
autocorrelation function

$$
R_{zz}(x_2 - x_1) = E[z(x_1)z(x_2)]
\tag{C.1}
$$

or $R_{zz}(\xi) = E[z(x_1)z(x_1 + \xi)]$, where $\xi = x_2 - x_1$, which is the spatial separation of the two
points.

Now we shall consider a spectral analysis of the process. If the independent variable is
time then we speak of $\omega$ (rad/s). Here we shall use $k$ (rad/m), and this is called the wave
number. Note that $\omega = 2\pi/T$ shows how fast it oscillates (in radians) in a second, while
$k = 2\pi/\lambda$ represents how many cycles (in radians) of the wave are in a metre, where $\lambda$ is the
wavelength. Then, the wave number spectrum is defined as

$$
S_{zz}(k) = \int_{-\infty}^{\infty} R_{zz}(\xi)\,e^{-jk\xi}\,d\xi
\tag{C.2}
$$

Compare this expression with the usual spectral density function, $S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\,e^{-j\omega\tau}\,d\tau$.
We note that period $T = 2\pi/\omega$ is now replaced by wavelength $\lambda = 2\pi/k$. (See
Newland (1984) for more details.)


Application 

Consider a vehicle moving over rough ground at speed $V$ as shown in Figure C.2. The equation
of motion for this simple model is $m\ddot{y}(t) = -k[y(t) - z(t)] - c[\dot{y}(t) - \dot{z}(t)]$, so that

$$
m\ddot{y}(t) + c\dot{y}(t) + ky(t) = c\dot{z}(t) + kz(t)
\tag{C.3}
$$

The problem we are concerned with is: given a specific road property $R_{zz}(\xi)$, calculate the
value of the variance of $y(t)$ as the vehicle moves over the ground at constant speed $V$.



Figure C.2 A vehicle moving over rough ground at speed V 


If we treat $z(t)$ as a stationary random variable, the variance of $y(t)$ can be written as (we
assume $y(t)$ has a zero mean value)

$$
E[y^2(t)] = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{yy}(\omega)\,d\omega
= \frac{1}{2\pi}\int_{-\infty}^{\infty} |H(\omega)|^2 S_{zz}(\omega)\,d\omega
\tag{C.4}
$$

where the system frequency response function is

$$
H(\omega) = \frac{k + jc\omega}{k - m\omega^2 + jc\omega}
$$

Now, it remains to obtain $S_{zz}(\omega)$ while we only have $S_{zz}(k)$ at present, i.e. we must interpret
a wave number spectrum as a frequency spectrum. We do this as follows. First, we convert a
temporal autocorrelation to a spatial one as

$$
R_{zz}(\tau) = E[z(t)z(t + \tau)] = E[z(x(t))z(x(t + \tau))] = E[z(x)z(x + V\tau)] = R_{zz}(V\tau)
\tag{C.5}
$$

and so

$$
S_{zz}(\omega) = \int_{-\infty}^{\infty} R_{zz}(\tau)\,e^{-j\omega\tau}\,d\tau
= \int_{-\infty}^{\infty} R_{zz}(V\tau)\,e^{-j\omega\tau}\,d\tau
\tag{C.6}
$$

Letting $V\tau = \xi$, this can be rewritten as

$$
S_{zz}(\omega) = \frac{1}{V}\int_{-\infty}^{\infty} R_{zz}(\xi)\,e^{-j(\omega/V)\xi}\,d\xi
= \frac{1}{V}\,S_{zz}(k)\Big|_{k = \omega/V}
\tag{C.7}
$$

Thus, to obtain the frequency spectrum from the wave number spectrum we simply replace $k$
by $\omega/V$ and divide by $V$. Note that the speed can be expressed by

$$
V = f\lambda = \frac{\omega}{2\pi}\lambda = \frac{\omega}{k}
$$

For a simple example, if the road property is $R_{zz}(\xi) = e^{-0.2|\xi|}$ then the wave number
spectrum and the power spectral density function are

$$
S_{zz}(k) = \frac{0.4}{0.04 + k^2}
\quad \text{and} \quad
S_{zz}(\omega) = \frac{0.4V}{0.04V^2 + \omega^2}
$$

respectively.
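
To make the procedure concrete, the following Python sketch applies Equation (C.7) to the example road spectrum and then evaluates Equation (C.4) numerically. It is a minimal illustration: the vehicle parameters, the speed, the truncation frequency `omega_max` and the function names are all assumed values of our own, not taken from the text.

```python
import math


def S_zz_k(k):
    """Wave number spectrum of the road profile R_zz(xi) = exp(-0.2 |xi|)."""
    return 0.4 / (0.04 + k * k)


def S_zz_omega(omega, V):
    """Frequency spectrum seen at speed V: replace k by omega/V and divide by V."""
    return S_zz_k(omega / V) / V


def H(omega, mass, damping, stiffness):
    """Frequency response function of the model in Equation (C.3)."""
    return (stiffness + 1j * damping * omega) / (
        stiffness - mass * omega ** 2 + 1j * damping * omega
    )


def variance_y(V, mass, damping, stiffness, omega_max=400.0, n=200000):
    """Trapezoidal approximation of Equation (C.4), truncated at +/- omega_max."""
    h = 2.0 * omega_max / n
    total = 0.0
    for i in range(n + 1):
        w = -omega_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * abs(H(w, mass, damping, stiffness)) ** 2 * S_zz_omega(w, V)
    return total * h / (2.0 * math.pi)


# Assumed values: m = 1 kg, c = 4 N s/m, k = 100 N/m, V = 10 m/s
print("S_zz(omega = 3 rad/s) at V = 10 m/s:", S_zz_omega(3.0, 10.0))
print("variance of y(t):", variance_y(V=10.0, mass=1.0, damping=4.0, stiffness=100.0))
```

The stiffness is named `stiffness` in the code to avoid a clash with the wave number $k$.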


Appendix D 

Some Comments on the Ordinary
Coherence Function $\gamma_{xy}^2(f)$


The Use of the Ordinary Coherence Function 

If we wish to estimate the transfer function linking two signals from $S_{xy}(f) = H(f)S_{xx}(f)$,
i.e. by forming the ratio $H(f) = S_{xy}(f)/S_{xx}(f)$, then we may compute the coherence function
which is a direct measure of the ‘validity’ of this relationship. That is, if $\gamma_{xy}^2(f) \approx 1$ the transfer
function $H(f)$ is well estimated; if $\gamma_{xy}^2(f)$ is low, the estimate of $H(f)$ is not trustworthy.

Also, this concept can be applied to multiple-source problems. For example, let $x(t)$ and
$y(t)$ be the input and the output, and suppose there is another input $z(t)$ which we have not
accounted for and it contributes to $y(t)$ as shown in Figure D.1.

Figure D.1 Multiple-source problems

If $z(t)$ is uncorrelated with $x(t)$, then its effect on the coherence function between $x(t)$
and $y(t)$ is the same as the measurement noise $n_y(t)$ as in Case (a), Section 9.2.

The Use of the Concept of Coherent Output Power 

Consider the experiment depicted in Figure D.2. A measurement $y_m(t)$ is made of sound from
a plate being shaken with the addition of background noise $n(t)$ from a speaker, i.e.

$$
y_m(t) = y(t) + n(t)
\tag{D.1}
$$

where $y(t)$ is the sound due to the plate and $n(t)$ is the noise due to the speaker.

Figure D.2 Measurements of acoustic pressure resulting from a vibrating plate

From the coherence measurement $\gamma_{x y_m}^2(f)$ and the power spectral density $S_{y_m y_m}(f)$ we
can calculate the power at the microphone due to the plate by

$$
S_{yy}(f) = \gamma_{x y_m}^2(f)\,S_{y_m y_m}(f)
\tag{D.2}
$$

This is only satisfactory if x(t) is a ‘good’ measurement of the basic source, i.e. the vibration 
signal x(t) must be closely related to the radiated sound y(t). An example where this might 
not be so is as shown in Figure D.3. 

Figure D.3 Measurements of acoustic pressure due to a motor/blower

Now $x(t)$ will not be a good measurement of the primary noise source in general, i.e. the
accelerometer will not measure the aerodynamic noise.
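
The coherent output power calculation of Equation (D.2) can be illustrated with a small simulation. The following Python sketch works at a single frequency and treats the Fourier coefficients of each record as complex Gaussian numbers; the plate-to-microphone response `H`, the noise level and the number of records are assumed values, and the helper `cgauss` is our own.

```python
import random

random.seed(2)


def cgauss(sigma):
    """Complex Gaussian sample with independent real and imaginary parts."""
    return complex(random.gauss(0.0, sigma), random.gauss(0.0, sigma))


H = 0.8 - 0.6j          # assumed plate-to-microphone response at this frequency
num_records = 20000

s_xx = 0.0              # estimate of S_xx
s_mm = 0.0              # estimate of S_{y_m y_m}
s_yy_true = 0.0         # power due to the plate alone (known only in simulation)
s_xm = 0.0 + 0.0j       # estimate of S_{x y_m}
for _ in range(num_records):
    X = cgauss(1.0)     # vibration signal x(t)
    Y = H * X           # sound due to the plate, y(t)
    N = cgauss(0.7)     # uncorrelated speaker noise n(t)
    Ym = Y + N          # microphone measurement, Equation (D.1)
    s_xx += abs(X) ** 2
    s_mm += abs(Ym) ** 2
    s_xm += X.conjugate() * Ym
    s_yy_true += abs(Y) ** 2

s_xx /= num_records
s_mm /= num_records
s_xm /= num_records
s_yy_true /= num_records

coherence = abs(s_xm) ** 2 / (s_xx * s_mm)
coherent_output_power = coherence * s_mm      # Equation (D.2)
print("coherence:", coherence)
print("coherent output power:", coherent_output_power)
print("true plate power:", s_yy_true)
```

In this simulation $x(t)$ is a ‘good’ measurement of the source by construction, so the coherent output power recovers the plate contribution even though the microphone signal is contaminated by the speaker noise.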





Appendix E 

Least Squares Optimization: 
Complex- Valued Problem 


Consider the least squares problem (Case 1 in Section 9.3) which finds the optimal parameter
$a_1$ that fits the data such that $y = a_1 x$, where the objective function is given by

$$
J_1 = \frac{1}{N}\sum_{i=1}^{N} (y_i - a_1 x_i)^2
$$

If $x$ and $y$ are complex valued, then we may find an optimal complex parameter $a_1$, where the
objective function is

$$
J_1 = \frac{1}{N}\sum_{i=1}^{N} |y_i - a_1 x_i|^2
= \frac{1}{N}\sum_{i=1}^{N} (y_i^* - a_1^* x_i^*)(y_i - a_1 x_i)
\tag{E.1}
$$

Let $x_i = x_{i,R} + jx_{i,I}$, $y_i = y_{i,R} + jy_{i,I}$ and $a_1 = a_R + ja_I$. Then Equation (E.1) can be
written as

$$
\begin{aligned}
J_1 = \frac{1}{N}\sum_{i=1}^{N} \big[ & (y_{i,R}^2 + y_{i,I}^2) - 2a_R(x_{i,R}y_{i,R} + x_{i,I}y_{i,I}) \\
& + 2a_I(x_{i,I}y_{i,R} - x_{i,R}y_{i,I}) + (a_R^2 + a_I^2)(x_{i,R}^2 + x_{i,I}^2) \big]
\end{aligned}
\tag{E.2}
$$

This is a real quantity. To minimize $J_1$ with respect to both $a_R$ and $a_I$, we solve the following
equations:

$$
\begin{aligned}
\frac{\partial J_1}{\partial a_R} &= \frac{1}{N}\sum_{i=1}^{N} \big[ -2(x_{i,R}y_{i,R} + x_{i,I}y_{i,I}) + 2a_R(x_{i,R}^2 + x_{i,I}^2) \big] = 0 \\
\frac{\partial J_1}{\partial a_I} &= \frac{1}{N}\sum_{i=1}^{N} \big[ 2(x_{i,I}y_{i,R} - x_{i,R}y_{i,I}) + 2a_I(x_{i,R}^2 + x_{i,I}^2) \big] = 0
\end{aligned}
\tag{E.3}
$$

Solving these equations gives

$$
a_R = \frac{\sum_{i=1}^{N} (x_{i,R}y_{i,R} + x_{i,I}y_{i,I})}{\sum_{i=1}^{N} (x_{i,R}^2 + x_{i,I}^2)}
\quad \text{and} \quad
a_I = \frac{\sum_{i=1}^{N} (x_{i,R}y_{i,I} - x_{i,I}y_{i,R})}{\sum_{i=1}^{N} (x_{i,R}^2 + x_{i,I}^2)}
$$

Thus the optimal complex parameter $a_1$ can be written as

$$
a_1 = a_R + ja_I
= \frac{\sum_{i=1}^{N} \big[ (x_{i,R}y_{i,R} + x_{i,I}y_{i,I}) + j(x_{i,R}y_{i,I} - x_{i,I}y_{i,R}) \big]}{\sum_{i=1}^{N} (x_{i,R}^2 + x_{i,I}^2)}
= \frac{\sum_{i=1}^{N} x_i^* y_i}{\sum_{i=1}^{N} |x_i|^2}
\tag{E.4}
$$
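Similarly, the complex form of $a_2$ and $a_T$ (Case 2 and Case 3 in Section 9.3) can be found as

$$
a_2 = \frac{\sum_{i=1}^{N} |y_i|^2}{\sum_{i=1}^{N} y_i^* x_i}
\tag{E.5}
$$

$$
a_T = \frac{\left( \sum_{i=1}^{N} |y_i|^2 - \sum_{i=1}^{N} |x_i|^2 \right)
+ \sqrt{\left( \sum_{i=1}^{N} |x_i|^2 - \sum_{i=1}^{N} |y_i|^2 \right)^2 + 4\left| \sum_{i=1}^{N} x_i^* y_i \right|^2}}
{2\sum_{i=1}^{N} y_i^* x_i}
\tag{E.6}
$$

Note the location of conjugates in the above equations, and compare with the frequency
response function estimators $H_1(f)$ and $H_T(f)$ given in Section 9.3.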
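
The three complex gains are straightforward to compute. The following Python sketch is a minimal illustration of Equations (E.4) to (E.6); the function name `complex_gains` and the test data are our own. For noise-free data $y_i = a x_i$ all three estimates should return the same complex gain $a$.

```python
import math


def complex_gains(x, y):
    """Return (a_1, a_2, a_T) of Equations (E.4), (E.5) and (E.6)."""
    s_xx = sum(abs(v) ** 2 for v in x)                     # sum of |x_i|^2
    s_yy = sum(abs(v) ** 2 for v in y)                     # sum of |y_i|^2
    s_xy = sum(a.conjugate() * b for a, b in zip(x, y))    # sum of x_i^* y_i
    s_yx = s_xy.conjugate()                                # sum of y_i^* x_i
    a_1 = s_xy / s_xx
    a_2 = s_yy / s_yx
    root = math.sqrt((s_xx - s_yy) ** 2 + 4.0 * abs(s_xy) ** 2)
    a_t = (s_yy - s_xx + root) / (2.0 * s_yx)
    return a_1, a_2, a_t


# Noise-free data y = a x with an assumed complex gain a = 1.5 - 2j
a_true = 1.5 - 2.0j
x = [1 + 1j, -2 + 0.5j, 0.3 - 1j, 2j]
y = [a_true * v for v in x]

a_1, a_2, a_t = complex_gains(x, y)
print("a_1 =", a_1)
print("a_2 =", a_2)
print("a_T =", a_t)
```

Swapping the conjugate from $x_i$ to $y_i$ in the cross term changes the sign of the phase of the estimate, which is why the placement of the conjugates matters.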


Appendix F 

Proof of $H_W(f) \to H_1(f)$ as $\kappa(f) \to \infty$


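We start from Equation (9.67), i.e.

$$
H_W(f) = \frac{\tilde{S}_{y_m y_m}(f) - \kappa(f)\tilde{S}_{x_m x_m}(f)
+ \sqrt{\left[ \kappa(f)\tilde{S}_{x_m x_m}(f) - \tilde{S}_{y_m y_m}(f) \right]^2 + 4|\tilde{S}_{x_m y_m}(f)|^2 \kappa(f)}}
{2\tilde{S}_{y_m x_m}(f)}
\tag{F.1}
$$

Let $\kappa(f) = 1/\varepsilon$; then the right hand side of the equation can be written as

$$
\frac{f(\varepsilon)}{g(\varepsilon)} =
\frac{\tilde{S}_{y_m y_m}(f)\varepsilon - \tilde{S}_{x_m x_m}(f)
+ \sqrt{\tilde{S}_{x_m x_m}^2(f) - 2\tilde{S}_{x_m x_m}(f)\tilde{S}_{y_m y_m}(f)\varepsilon + \tilde{S}_{y_m y_m}^2(f)\varepsilon^2 + 4|\tilde{S}_{x_m y_m}(f)|^2 \varepsilon}}
{2\tilde{S}_{y_m x_m}(f)\varepsilon}
\tag{F.2}
$$

Now, taking the limit $\varepsilon \to 0$ (instead of $\kappa \to \infty$) and applying L’Hôpital’s rule, i.e.

$$
\lim_{\varepsilon \to 0} \frac{f(\varepsilon)}{g(\varepsilon)} = \lim_{\varepsilon \to 0} \frac{f'(\varepsilon)}{g'(\varepsilon)}
$$

we obtain

$$
\begin{aligned}
\lim_{\varepsilon \to 0} \frac{f'(\varepsilon)}{g'(\varepsilon)}
&= \frac{\tilde{S}_{y_m y_m}(f) + \frac{1}{2}\left( \tilde{S}_{x_m x_m}^2(f) \right)^{-1/2}
\left( -2\tilde{S}_{x_m x_m}(f)\tilde{S}_{y_m y_m}(f) + 4|\tilde{S}_{x_m y_m}(f)|^2 \right)}{2\tilde{S}_{y_m x_m}(f)} \\
&= \frac{\tilde{S}_{y_m y_m}(f) - \tilde{S}_{y_m y_m}(f) + 2\left( \tilde{S}_{x_m x_m}(f) \right)^{-1} |\tilde{S}_{x_m y_m}(f)|^2}{2\tilde{S}_{y_m x_m}(f)}
= \frac{\left( \tilde{S}_{x_m x_m}(f) \right)^{-1} |\tilde{S}_{x_m y_m}(f)|^2}{\tilde{S}_{y_m x_m}(f)} \\
&= \frac{\tilde{S}_{x_m y_m}^*(f)\tilde{S}_{x_m y_m}(f)}{\tilde{S}_{x_m x_m}(f)\tilde{S}_{y_m x_m}(f)}
= \frac{\tilde{S}_{y_m x_m}(f)\tilde{S}_{x_m y_m}(f)}{\tilde{S}_{x_m x_m}(f)\tilde{S}_{y_m x_m}(f)}
= \frac{\tilde{S}_{x_m y_m}(f)}{\tilde{S}_{x_m x_m}(f)} = H_1(f)
\end{aligned}
\tag{F.3}
$$

This proves the result.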
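
The limit can also be observed numerically. The following Python sketch evaluates Equation (F.1) at a single frequency for increasing $\kappa$; the spectral values and the function name `H_w` are assumptions made for illustration. Setting $\kappa = 0$ directly in (F.1) gives $\tilde{S}_{y_m y_m}/\tilde{S}_{y_m x_m}$, so that value is printed as well for comparison.

```python
import math


def H_w(s_xx, s_yy, s_xy, kappa):
    """Evaluate Equation (F.1); s_xy stands for the cross-spectrum S_{x_m y_m}."""
    s_yx = s_xy.conjugate()
    root = math.sqrt((kappa * s_xx - s_yy) ** 2 + 4.0 * abs(s_xy) ** 2 * kappa)
    return (s_yy - kappa * s_xx + root) / (2.0 * s_yx)


# Assumed spectral estimates at a single frequency (|s_xy|^2 <= s_xx * s_yy)
s_xx, s_yy, s_xy = 2.0, 3.0, 1.2 + 0.9j
H_1 = s_xy / s_xx
H_at_zero = s_yy / s_xy.conjugate()   # value of (F.1) when kappa = 0

for kappa in (1e-6, 1.0, 1e2, 1e4, 1e6):
    print("kappa =", kappa, " H_W =", H_w(s_xx, s_yy, s_xy, kappa))
print("H_1 =", H_1)
print("kappa = 0 value =", H_at_zero)
```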
Appendix G

Justification of the Joint Gaussianity
of X(f)


If two random variables $X$ and $Y$ are jointly Gaussian, then the individual distribution remains
Gaussian under the coordinate rotation. This may be seen from Figure G.1. For example, if
$X'$ and $Y'$ are obtained by

$$
\begin{bmatrix} X' \\ Y' \end{bmatrix} =
\begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix}
\begin{bmatrix} X \\ Y \end{bmatrix}
\tag{G.1}
$$

then they are still normally distributed. For a complex variable, e.g. $Z = X + jY$, the equivalent
rotation is $e^{j\phi}Z$. If two random variables are Gaussian (individually) but not jointly
Gaussian, then this property does not hold. An example of this is illustrated in Figure G.2.


Figure G.1 Two random variables are jointly normally distributed

Figure G.2 Each random variable is normally distributed, but not jointly

Now, consider a Gaussian process $x(t)$. The Fourier transform of $x(t)$ can be written as

$$
X(f) = X_c(f) + jX_s(f)
\tag{G.2}
$$


where $X_c(f)$ and $X_s(f)$ are Gaussian since $x(t)$ is Gaussian. If these are jointly Gaussian, they
must remain Gaussian under the coordinate rotation (for any rotation angle $\phi$). For example,
if $e^{j\phi}X(f) = X'(f) = X'_c(f) + jX'_s(f)$ then $X'_c(f)$ and $X'_s(f)$ must be Gaussian, where
$X'_c(f) = X_c(f)\cos\phi - X_s(f)\sin\phi$ and $X'_s(f) = X_c(f)\sin\phi + X_s(f)\cos\phi$.

For a particular frequency $f$, let $\phi = -2\pi f t_0$. Then $e^{-j2\pi f t_0}X(f)$ is a pure delay, i.e.
$x(t - t_0)$ in the time domain for that frequency component. If we assume that $x(t)$ is a stationary
Gaussian process, then $x(t - t_0)$ is also Gaussian, so both $X'_c(f)$ and $X'_s(f)$ remain
Gaussian under the coordinate rotation. This justifies that $X_c(f)$ and $X_s(f)$ are jointly normally
distributed.
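
The remark that individually Gaussian variables need not stay Gaussian under rotation can be illustrated numerically. The following Python sketch uses a constructed example of our own (not the one in Figure G.2): $Y$ is set to $+X$ or $-X$ with equal probability, so each of $X$ and $Y$ is Gaussian but the pair is not jointly Gaussian. The kurtosis of the rotated variable $X'$ from Equation (G.1) is used as a simple indicator, since a Gaussian variable has kurtosis 3.

```python
import math
import random

random.seed(3)


def kurtosis(samples):
    """Fourth central moment divided by the squared variance (3 for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / m2 ** 2


def rotate(xs, ys, phi):
    """First component of Equation (G.1): X' = X cos(phi) - Y sin(phi)."""
    return [x * math.cos(phi) - y * math.sin(phi) for x, y in zip(xs, ys)]


n = 50000
phi = -math.pi / 4.0    # assumed rotation angle

# Jointly Gaussian pair: two independent standard normal variables.
x1 = [random.gauss(0, 1) for _ in range(n)]
y1 = [random.gauss(0, 1) for _ in range(n)]

# Individually Gaussian but not jointly: Y = +X or -X with equal probability.
x2 = [random.gauss(0, 1) for _ in range(n)]
y2 = [x if random.random() < 0.5 else -x for x in x2]

k_joint = kurtosis(rotate(x1, y1, phi))
k_not_joint = kurtosis(rotate(x2, y2, phi))
print("kurtosis after rotation, jointly Gaussian:    ", k_joint)
print("kurtosis after rotation, not jointly Gaussian:", k_not_joint)
```

In the second case the rotated variable is exactly zero about half of the time, so its distribution is clearly not Gaussian and the kurtosis departs markedly from 3.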


Appendix H 

Some Comments on Digital Filtering 


We shall briefly introduce some terminology and methods of digital filtering that may be
useful. There are many good texts on this subject: for example, Childers and Durling (1975),
Oppenheim and Schafer (1975), Oppenheim et al. (1999) and Rabiner and Gold (1975). Also,
sound and vibration engineers may find some useful concepts in White and Hammond (2004)
together with some other advanced topics in signal processing.

The reason for including this subject is that we have used some digital filtering
techniques through various MATLAB examples, and also introduced some basic concepts in
Chapter 6 when we discussed a digital LTI system, i.e. the input-output relationship for a
digital system that can be expressed by

$$
y(n) = -\sum_{k=1}^{N} a_k y(n - k) + \sum_{r=0}^{M} b_r x(n - r)
\tag{H.1}
$$

where $x(n)$ denotes an input sequence and $y(n)$ the output sequence. This difference equation
is the general form of a digital filter which can easily be programmed to produce an output
sequence for a given input sequence. The z-transform may be used to solve this equation and
to find the transfer function which is given by

$$
H(z) = \frac{Y(z)}{X(z)} = \frac{\sum_{r=0}^{M} b_r z^{-r}}{1 + \sum_{k=1}^{N} a_k z^{-k}}
\tag{H.2}
$$

By appropriate choice of the coefficients $a_k$ and $b_r$ and the orders $N$ and $M$, the characteristics
of $H(z)$ can be adjusted to some desired form. Note that, since we are using a finite word
length in the computation, the coefficients cannot be represented exactly. This will introduce
some arithmetic round-off error.


Fundamentals of Signal Processing for Sound and Vibration Engineers 
K. Shin and J. K. Hammond. © 2008 John Wiley & Sons, Ltd 




In the above equations, if at least one of the coefficients $a_k$ is not zero the filter is said to be
recursive, while it is non-recursive if all the coefficients $a_k$ are zero. If the filter has a finite
memory then it is called an FIR (Finite Impulse Response) filter, i.e. the impulse response
sequence has a finite length. Conversely, an IIR (Infinite Impulse Response) filter has an
infinite memory. Note that the terms ‘recursive’ and ‘non-recursive’ do not refer to whether
the memory is finite or infinite, but describe how the filter is realized. However, in general,
the usual implementation is that FIR filters are non-recursive and IIR filters recursive.
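
The difference equation (H.1) can be programmed directly. The following Python sketch is a minimal illustration (the function name `difference_equation` and the coefficient values are our own, and samples before $n = 0$ are assumed to be zero). It shows the impulse response of a non-recursive filter, which is finite, and of a simple recursive filter, which decays but never becomes exactly zero.

```python
def difference_equation(b, a, x):
    """Apply y(n) = -sum_{k=1}^{N} a_k y(n-k) + sum_{r=0}^{M} b_r x(n-r).

    `a` holds a_1..a_N (the leading 1 of the denominator of H(z) is implied)
    and `b` holds b_0..b_M. Samples before n = 0 are taken as zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for r, b_r in enumerate(b):
            if n - r >= 0:
                acc += b_r * x[n - r]
        for k, a_k in enumerate(a, start=1):
            if n - k >= 0:
                acc -= a_k * y[n - k]
        y.append(acc)
    return y


impulse = [1.0] + [0.0] * 7

# Non-recursive (all a_k zero): the impulse response is the sequence b_r itself.
fir_response = difference_equation([0.5, 0.3, 0.2], [], impulse)
print(fir_response)

# Recursive first-order example: y(n) = 0.9 y(n-1) + x(n), i.e. a_1 = -0.9.
iir_response = difference_equation([1.0], [-0.9], impulse)
print(iir_response)
```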

There are many methods of designing both types of filters. A popular procedure for 
designing IIR digital filters is the discretization of some well-known analogue filters. One of 
the methods of discretization is the ‘impulse-invariant’ method that creates a filter such that 
its impulse response sequence matches the impulse response function of the corresponding 
analogue filter (see Figure 5.6 for mapping from the s-plane to z-plane). It is simple and easy 
to understand, but high-pass and band-stop filters cannot be designed by this method. It also 
suffers from aliasing problems. Another discretization method, probably more widely used, is 
the ‘bilinear mapping’ method, which avoids aliasing. However, it introduces some frequency 
distortion (more distortion towards high frequencies) which must be compensated for (the 
technique for the compensation is called ‘prewarping’). 

FIR filters are often preferred since they are always stable and can be designed to have linear
phase characteristics (i.e. no phase distortion). The main disadvantage compared with IIR filters
is that the number of filter coefficients must be large enough to achieve adequate cut-off
characteristics. There are three basic methods to design FIR filters: the window method, the
frequency sampling method and the optimal filter design method. The window method designs
a digital filter in the form of a Fourier series which is then truncated. The truncation introduces
distortion in the frequency domain which can be reduced by modifying the Fourier coefficients
using windowing techniques. The frequency sampling method specifies the filter in terms of
$H(k)$, where $H(k)$ is DFT[$h(n)$]. This method is particularly attractive when designing narrow-band
frequency-selective filters. The principle of optimal filter design is to minimize the mean
square error between the desired filter characteristic and the transfer function of the filter.

Finally, we note that IIR filters introduce phase distortion. This is an inevitable consequence
of their structure. However, if the measured data can be stored, then ‘zero-phase’
filtering can be achieved by using the concept of ‘reverse time’. This is done by filtering the
data ‘forward’ and then ‘backward’ with the same filter as shown in Figure H.1.



Figure H.1 Zero-phase digital filtering


The basic point of this scheme is that the reverse time processing of data ‘undoes’ the
delays of forward time processing. This zero-phase filtering is a simple and effective procedure,
though there is one thing to note: namely, the ‘starting transients’ at each end of the data.
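
The forward and backward scheme can be sketched in a few lines of Python. This is an illustration only: the one-pole low-pass filter, its coefficient `alpha` and the test pulse are assumed choices of our own, and no special treatment of the starting transients is attempted. The position of the pulse peak is used to show the delay of ordinary (forward only) filtering and its removal by the zero-phase procedure.

```python
import math


def one_pole_lowpass(x, alpha=0.2):
    """Simple recursive (IIR) filter y(n) = (1 - alpha) y(n-1) + alpha x(n)."""
    y, prev = [], 0.0
    for sample in x:
        prev = (1.0 - alpha) * prev + alpha * sample
        y.append(prev)
    return y


def zero_phase(x, alpha=0.2):
    """Filter forward, reverse the result, filter again, and reverse back."""
    forward = one_pole_lowpass(x, alpha)
    backward = one_pole_lowpass(forward[::-1], alpha)
    return backward[::-1]


def peak_index(seq):
    """Index of the largest sample."""
    return max(range(len(seq)), key=lambda i: seq[i])


# A symmetric pulse centred on sample 100 (assumed test signal)
x = [math.exp(-0.5 * ((n - 100) / 5.0) ** 2) for n in range(201)]

print("input peak at sample:       ", peak_index(x))
print("forward only peak at sample:", peak_index(one_pole_lowpass(x)))
print("zero-phase peak at sample:  ", peak_index(zero_phase(x)))
```

Because the data pass through the filter twice, the overall magnitude response is the square of that of the single filter, which should be kept in mind when choosing the filter.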
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Aliasing, 123, 126-128, 140-144, 181 
All-pass filter, see Filter, all-pass 
Amplitude Modulation, see Modulation, amplitude 
Analogue-to-digital conversion, 131-134 
Analogue-to-digital converter (ADC), 130, 131 
Analytic signal, 9 1 

Anti-aliasing filter, see Filter, anti-aliasing 
Anti-imaging filter, see Filter, reconstruction 
Autocorrelation coefficient, 225, 228 
Autocorrelation function, 225-227, 23 1 
computational form, 23 1 , 325 
estimator, 323 

examples, 234-240, 255-258, 259-261, 274 

properties, 228 

sine wave, 234-235, 255-256 

square wave, 238-239 

time delay problem, 237-238, 256-258 

transient signal, 239-240 

via FFT, 325-326 

white noise, 236 

Autocovariance function, see Autocorrelation function 

Auto-regressive (AR), 149 

Auto-regressive moving average (ARMA), 149 

Band-limited, 128. See also White noise, band-limited 
Bandwidth 
3 dB, 99, 329, 332 
noise, 99, 100, 332 
resolution, 340, 341, 342, 344 
Bandwidth- time (BT) product, see Uncertainty principle 
Bias, 94, 318. See also Error; Estimator errors 
Bivariate, 201, 205, 206 


Bounded input/bounded output (BIBO) stable, 87, 149 
Butterworth, see Filter, low-pass 

Causal, 75, 76, 147 
Central limit theorem, 205, 213-214 
Central moment, see Moment, central 
Cepstral analysis, 73 
Chebychev, see Filter, low-pass 
Chi-squared (x^) distribution, 335-336 
Coherence function, 284-287, 385 
effect of measurement noise, 285-287 
estimator, 349 
multiple, 368 
partial, 367 
virtual, 371 

Coherent output power, 286, 385 
Confidence interval, 319 
spectral estimates, 345-347 
Conjugate symmetry property, see Symmetry property 
Convolution, 3, 75-77, 147-148, 182-183 
circular, see Convolution, periodic 
fast, 164 
integral, 75-76 
linear, see Convolution, sum 
periodic, 162-163, 182-183 
sequences, see Convolution, sum 
sum, 147-148, 164, 170-171, 182-183 
Correlation, 206 
coefficient, 206, 215-216 
Correlation function, see Autocorrelation function; 

Cross-correlation function 
Cost function, see Objective function 
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Covariance, 206 

Covariance function, see Autocorrelation function; 

Cross-correlation function 
Cross-correlation coefficient, 228 
Cross-correlation function, 227-228, 231 
computational form, 23 1 , 325 
estimator, 324 

examples, 240-242, 258-266, 273-274 
properties, 228-229 
time delay problem, 241-242, 261-266 
Cross-covariance function, see Cross-correlation 
function 

Cross-spectral density function, 247 
estimator, 292, 347, 348 
examples, 249-251, 266-275 
properties, 247-249 
raw, 347, 348 
smoothed, 347, 348 
time delay problem, 250-25 1 
Cumulative distribution function, see Distribution 
function 

Cut-off frequency, 129, 130 

Data truncation, 94-96, 109-114, 155-156, 158-160, 
171-174. See also Fourier series, computational 
consideration 
Data validation, 136 
Decimation in time (DIT), 165 
Deconvolution, see Cepstral analysis 
Degrees of freedom, 335, 340, 344, 345, 346 
Delta function, 38-39, See also Impulse 
Dirac delta, 38 

Fourier transform, see Fourier integral, Dirac delta; 

Discrete Fourier Transform, Kronecker delta 
Kronecker delta, 146 
properties, 39-40 

Deterministic, see Signal, deterministic 
Digital filter, see Filter, digital 
Digital-to-analogue converter (DAC), 135, 139 
Discrete Fourier transform (DFT), 50, 153-155, 

156 

inverse (IDFT), 51, 154 
Kronecker delta, 160 
properties, 160-161 
scaling effect, 158-160 

Dirichlet conditions, see Fourier series, convergence 
Dispersion, see Group delay 
Distribution function, 199, 200 
Dynamic range, 130, 133, 134 

Echo, 72-73, 103-104 
Ensemble, 220 
Ensemble average, 223-224 

autocorrelation function, 226-227, 255-256 
probability density function, 253-254 


Envelope analysis, 91 
Ergodic, 229 

Error, see also Estimator errors 
bias error, 319 
random error, 319, 352 
RMS error, 319 
Estimator errors, 317-320 

autocorrelation function, 323-324 
coherence function, 349-350, 358-360 
cross-correlation function, 324-325 
cross-spectral density function, 348-349, 354-358, 
360-362 

frequency response function, 351-352 
mean square value, 321-322 
mean value, 320-321 

power spectral density function, 327-330, 334-337, 
339-342, 343-344, 345, 354-358 
table, 352 

Even function, 37, 44, 59 
Expectation, 202 

Expected value, 203. See also Ensemble average 
Experiment of chance, 194 
Event, 194 

algebra, 194-195 
equally likely, 194, 196 

Fast Fourier transform (FFT), 164-166. See also 
Discrete Fourier transform 
Filter 

all-pass, 85 

anti-aliasing, 128-131, 143 
band-pass, 82 
constant bandwidth, 330 
constant percentage bandwidth, 331 
digital, 148, 393-394 
low-pass, 82, 129 
octave, 331 
reconstruction, 139 
third (1/3) octave, 331 
Filter bank method, 327. See also Power 
spectral density function, estimation 
methods 

Finite Impulse Response (FIR), 265, 394 
Folding frequency, 127 
Fourier integral, 57-61 
Dirac delta, 62 
examples, 62-67 
Gaussian pulse, 66 
inversion, 60-61 
pair, 59, 60 
periodic function, 67 
properties, 67-71 
rectangular pulse, 64 
sine function, 63 
table, 68 
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Fourier series, 31-34, 41, 42^43. See also Fourier 
transform 

complex form, 42-43 

computational consideration, 46^-8, 49, 54-56. 

See also Data truncation 
convergence, 36 
even function, 38 
odd function, 38 
rectangular pulse, 44-45 
relationship with DFT, 48, 50-51 
square wave, 34-36, 49 
Fourier transform 

continuous signal, see Fourier integral 
convolution, 70, 152 
descrete-time, 152 
differentiation, 70 

discrete, see Discrete Fourier transform 
product, 7 1 

properties, see Fourier integral, properties; Discrete 
Fourier transform, properties 
sampled sequence, 121, 153 
summary, 168-169 
train of delta functions, 122 
Frequency domain, 20 

Frequency modulation, see Modulation, frequency 
Frequency response function (FRF), see also System 
identification 

biasing effect of noise, 294-295, 307-312 
continuous system, 4, 77-78 
curve fitting, 311-313 
descrete (digital) system, 150 
estimator Hu 6 , 184, 293, 350 
estimator H 2 , 6 , 293 

estimator H 3 , see System identification, effect of 
feedback 

estimator Hj , 6 , 294 
estimator Hw , 293 

Frequency smoothing, 345. See also Power spectral 
density function, estimation methods 

Gaussian pulse, see Fourier integral, Gaussian pulse 
Gaussian, see Probability distribution, Gaussian 
Gibbs’ phenomenon, 36, 52-53 
Group delay, 72, 82-85, 104-105 
Group velocity, 84 

Hilbert transform, 90-93, 106-109 

Impulse-invariant, 125, 148 
Impulse, see Delta function 
Impulse response 
continuous, 75 
discrete, 147 

Impulse train, 41, 42, 119, 120 
Independent, see Statistically independent 


Infinite Impulse Response (HR), 125, 394 
Instantaneous amplitude, 91 
Instantaneous frequency, 9 1 
Instantaneous phase, 91 
Inverse spreading property, 63, 64, 101 

Kurtosis, 208, 216-218. See also Moment 
computational form, 210 

Laplace transform, 78, 124. See also z-transform 
sampled function, 124, 125 
Leakage, 94, 95 

Least squares, 289. See also Total least squares 
complex valued problem, 387-388 
Leptokurtic, see Kurtosis 
Linearity, 74 

Linear phase, see Pure delay 
Linear time-invariant (LTI) system, 73 
continuous, 73-81 
discrete, 147, 149-150 
examples, 78-81 

Matched filter, 263 

Mean square error, 319. See also Estimator errors 
Mean square value, 204, 222, 230, 321. See also 
Moment 

computational form, 230 
Mean value, 32, 203, 222, 230, 278, 317, 321. 

See also Moment 
computational form, 209, 230 
sample mean, see Mean value, computational 
form 

Minimum phase, 87-90 
Modulation 
amplitude, 70, 84, 91 
frequency, 93 

Moment, 203-204, 206, 207-210, 222-223 
central, 204 

computational consideration, 209-210 
properties, 207 
summary, 211 
Moving average (MA), 149 

Multiple-input and multiple-output (MIMO) system, 
363 

Mutually exclusive, 195, 196 
Noise power, 286 

Non-stationary, 224. See also Signals, non-starionary 
Nyquist diagram, see Polar diagram 
Nyquist frequency, 127 
Nyquist rate, 127 

Objective function, 289 
Odd function, 37, 44, 59 
Optimisation, 5. See also Least squares 
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Ordinary coherence function, see Coherence function 
Orthogonal, 33, 43, 206 
Overshoot, see Gibbs’ phenomenon 

Parseval’s theorem, 45, 61 
Passband, 129 
Periodogram, 337 
modified, 344 
raw, 337, 343 
Phase delay, 83, 84, 105 
Phase velocity, 84 
Platykurtic, see Kurtosis 
Poisson process, 235 
Polar diagram, 60 

Power spectral density function, 242-245, 327-347 
estimation methods, 327-345 
estimator, 292, 328, 337-338, 343, 345 
examples, 245-246, 270-275 
raw, 243, 333, 334, 343 
smoothed, 328, 337-338, 343, 345 
Principal component analysis (PCA), 370-372, 373 
Probability, 194 
algebra, 196 
conditional, 197 
joint, 196 

Probability density function, 200-201, 220-222 
chi-squared, 335 
Gaussian bivariate, 205 
joint, 202 , 222 
marginal, 202 

sine wave, 232-233, 253-254 
Probability distribution, 199. See also Distribution 
function 
Gaussian, 205 
jointly Gaussian, 391-392 
normal, see Probability distribution, Gaussian 
Rayleigh, 204 
standard normal, 205 
uniform, 133, 204 

Pure delay, 72. See also Group delay; Phase delay 

Quantization 11, 131, 132 
error, see Quantization, noise 
noise, 132 

Random, 8 , 193. See also Signal, random 
Random error, see Error; Estimator errors 
Random variable, 198 
continuous, 199 
discrete, 199 
residual, 366 

time-dependent, see Stochastic process 
Range space, 198 

Reconstruction filter, see Filter, reconstruction 
Relative frequency, 197, 212-213 


Resolution, 157, 174-175. See also Data truncation 
Root mean square (RMS), 204. See also Moment 

Sample space, 194 
Sampling, 119, 131 
Sampling rate, 120, 127, 131 
Sampling theorem, 137-139 
Schwartz’s inequality, 101 
Segment averaging, 275, 342-345. See also Power 
spectral density function, estimation methods 
Shannon’s sampling theorem, see Sampling theorem 
Skewness, 207, 208. See also Moment 
computational form, 210 

Sifting property, 39. See also Delta function, properties 
Signal, 6-14, 15, 16, 19-29 
almost periodic, 10, 12, 21-24, 28-29 
analogue, see Signal, continuous 
classification, 7 
clipped, 15 
continuous, 6 
deterministic, 7, 8, 10, 19 
digital, see Signal, discrete 
discrete, 6 

low dynamic range, 14 
non-deterministic, see Signal, random 
non-stationary, 13 
periodic with noise, 13 
periodic, 12, 19-21, 26-27, 31 
random, 8, 11 
square wave, 34 
transient with noise, 16 
transient, 10, 16, 24, 25 
Signal conditioning, 134 
Signal reconstruction, see Sampling theorem 
Signal-to-noise ratio (SNR), 133 
Sinc function, 40, 41, 64, 138 
Smearing, 94 

Spectra, see Spectrum; Spectral density; Power spectral 
density function; Cross-spectral density function 
Spectral density, see also Power spectral density 
function; Cross-spectral density function 
coincident, 248 
energy, 62 
quadrature, 248 
matrix, 364, 370, 379 
residual, 367, 369 

Spectrum, 43-46. See also Power spectral density 
function; Cross-spectral density function 
amplitude, 44, 59, 247 
line, 44 

magnitude, see Spectrum, amplitude 
phase, 44, 59, 247 
power, 45-46 

Stability, see Bounded input/bounded output (BIBO) 
stable 
Standard deviation, 204, 209 
Stationary, 9, 11, 224 

Statistical degrees of freedom, 346. See also Degrees of 
freedom 

Statistically independent, 197, 206 
Stochastic process, 220 
Stopband, 129 

Symmetry property, 69, 159, 161, 175-177 
System identification, 3-6, 183-190, 251, 287-297 

effect of feedback, 296-297 

effect of noise, 294-295 

examples, 183-190, 270-275, 298-315 

Time average, 229-231. See also Ensemble average; 
Ergodic 

autocorrelation function, 234, 255-256 
probability density function, 233, 253-254 
Time invariance, 74 
Time series analysis, 8 
Time shifting, 69. See also Pure delay 
Total least squares (TLS), 290, 373 
Transfer function 
continuous system, 78 
discrete system, 149 

Transmission paths identification, 303-307 

Uncertainty, 4 
noise, 4-6, 14 

Uncertainty principle, 100-101 


Unit step function, 66 
Unit step sequence, 146 
Univariate, 201 

Variance, 204, 209, 223, 231, 318. See also Moment; 
Estimator errors 
computational form, 209, 231 

Wave number spectra, 381 
Welch method, see Segment averaging method 
White noise, 236, 245, 281 
band-limited, 246 

Wiener-Khinchin theorem, 244, 247, 334 
Window, 94, 96-100 
Bartlett, 98, 339 
Hamming, 98, 339 

Hann (Hanning), 96, 98, 111, 112-117, 339 
lag, 337 
Parzen, 98, 339 

rectangular, 94, 97, 109, 112-117, 339 
spectral, 94, 337, 341 
table, 100, 338, 341 
Tukey, 96 

Windowing, see Data truncation 
z-transform, 123-124 

relationship with the Laplace transform, 124-126 
Zero-order hold, 139 
Zero padding, 110, 157, 178 
Zero phase filtering, 260, 394 


